We consider the population genetics problem: how long does it
take before some member of the population has m specified mutations?
The case m = 2 is relevant to onset of cancer due to the inactivation
of both copies of a tumor suppressor gene. Models for larger
m are needed for colon cancer and other diseases where a sequence
of mutations leads to cells with uncontrolled growth.



1. Introduction. It has long been known that cancer is the end result of
several mutations that disrupt normal cell division. Armitage and Doll [1]
did a statistical analysis of the age of onset of several cancers and fit power
laws to estimate the number of mutations. Knudson [15] discovered that the
incidence of retinoblastoma (cancer of the retina) grows as a linear function
of time in the group of children who have multiple cancers in both eyes,
but as a slower quadratic function in children who only have one cancer.
Based on this, Knudson proposed the concept of a tumor suppressor gene.
Later it was confirmed that in the first group of children, one copy is already
inactivated at birth, while in the second group both copies must be mutated
before cancer occurs. Since that time, about 30 tumor suppressor genes have
been identified. They have the property that inactivating the first copy does
not cause a change, while inactivating the second increases the cells' net
reproductive rate, which is a step toward cancer.

There is now considerable evidence that colon cancer is the end result of 
several mutations. The earliest evidence was statistical. Luebeck and Moolgavkar
[18] fit a four-stage model to the age-specific incidence of colorectal
cancers in the Surveillance, Epidemiology, and End Results registry, which
covers 10 percent of the US population. Calabrese et al. [5] examined 1022
colorectal cancers sampled from nine large regional hospitals in southeastern 
Finland. They found support for a model with five or six oncogenic muta- 
tions in individuals with hereditary risk factors and seven or eight mutations 
in patients without. 

Over the last decade, a number of studies have been carried out to identify 
the molecular pathways involved in the development of colorectal cancer. See 
Jones et al. [14] for a recent report. The process is initiated when a single 
colorectal cell acquires mutations inactivating the APC/$\beta$-catenin pathway.
This results in the growth of a small benign tumor (adenoma). Subsequent
mutations in a short list of other pathways transform the adenoma into 
a malignant tumor (carcinoma), and lead to metastasis, the ability of the 
cancer to spread to other organs. 

In this paper, we propose a simple mathematical model for cancer de- 
velopment in which cancer occurs when one cell accumulates m mutations. 
Consider a population of fixed size $N$. Readers who are used to the study
of the genetics of diploid organisms may have expected to see 2N here, but 
our concern is for a collection of N cells. We choose a model in which the 
number of cells is fixed because organs in the body are typically of constant 
size. We assume that the population evolves according to the Moran model, 
which was first proposed by Moran [19]. That is, each individual lives for 
an exponentially distributed amount of time with mean one, and then is 
replaced by a new individual whose parent is chosen at random from the 
individuals in the population (including the one being replaced). For more 
on this model, see Section 3.4 of [11]. 

In our model, each individual has a type $0 \le j \le m$. Initially, all individuals
have type 0. In the usual population genetics model, mutations only
occur at replacement events. We assume instead that types are clonally inherited,
that is, every individual has the same type as its parent. However,
thinking of a collection of cells that may acquire mutations due to radiation
or other environmental factors, we will suppose that during their lifetimes,
individuals of type $j - 1$ mutate to type $j$ at rate $u_j$. We call such a mutation
a type $j$ mutation. Let $X_j(t)$ be the number of type $j$ individuals at
time $t$. For each positive integer $m$, let $\tau_m = \inf\{t : X_m(t) > 0\}$ be the first
time at which there is an individual in the population of type $m$. Clearly, $\tau_1$
has the exponential distribution with rate $Nu_1$. Our goal is to compute the
asymptotic distribution of $\tau_m$ for $m \ge 2$ as $N \to \infty$.
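
The model can also be explored numerically. The following Python sketch is not part of the original analysis: it is a minimal Gillespie-style simulation written under the assumptions stated above (each individual is replaced at rate one by a copy of a uniformly chosen parent, and each type $j - 1$ individual mutates to type $j$ at rate $u_j$). The function name `simulate_tau_m` and the parameter values in the usage note are illustrative choices, not taken from the paper.

```python
import random

def simulate_tau_m(N, u, rng):
    """Return one sample of tau_m for the Moran model with mutation.

    N   : population size (fixed)
    u   : list [u_1, ..., u_m] of per-individual mutation rates
    rng : a random.Random instance
    """
    m = len(u)
    counts = [N] + [0] * (m - 1)   # counts[j] = number of type-j individuals, j < m
    t = 0.0
    while True:
        # type j -> j+1 happens at rate counts[j] * u_{j+1}; u_{j+1} is u[j]
        mut_rates = [counts[j] * u[j] for j in range(m)]
        total = N + sum(mut_rates)          # replacements occur at total rate N
        t += rng.expovariate(total)
        if rng.random() < N / total:
            # replacement: the dying individual and the parent are both
            # chosen uniformly from the current population
            dead = rng.choices(range(m), weights=counts)[0]
            parent = rng.choices(range(m), weights=counts)[0]
            counts[dead] -= 1
            counts[parent] += 1
        else:
            j = rng.choices(range(m), weights=mut_rates)[0]
            if j == m - 1:
                return t                    # first type-m individual appears
            counts[j] -= 1
            counts[j + 1] += 1
```

For example, `simulate_tau_m(1000, [1e-4, 1e-4], random.Random(0))` draws one value of $\tau_2$. For $m = 1$ the output should be exponential with rate $Nu_1$, which gives a simple sanity check of the code.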

We begin by considering the case m = 2 and discussing previous work. 
Schinazi [21, 22] has considered related questions. In the first paper, he 
computes the probability that in a branching process where individuals have 
two offspring with probability $p$ and zero with probability $1 - p$, a mutation
will arise before the process dies out. In the second paper, he uses this to
investigate the probability of a type 2 mutation when type 0 cells divide a
fixed number of times with the possibility of mutating to a type 1 cell that
begins a binary branching process.

More relevant to our investigation is the work of Komarova, Sengupta 
and Nowak [17], Iwasa, Michor and Nowak [13] and Iwasa et al. [12]. Their 
analysis begins with the observation that while the number of mutant individuals
is $o(N)$, we can approximate the number of cells with mutations
by a branching process in which each individual gives birth at rate one and 
dies at rate one. Let Z denote the total progeny of such a branching process. 
Since the embedded discrete time Markov chain is a simple random walk, 
we have (see, e.g., page 197 in [7]) 

    $P(Z > n) = 2^{-2n}\binom{2n}{n} \sim \frac{1}{\sqrt{\pi n}}.$
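
As a quick numerical illustration (a sketch added here, not part of the original text), the exact tail $2^{-2n}\binom{2n}{n}$ can be compared with the asymptotic value $1/\sqrt{\pi n}$. The function name `tail_total_progeny` is a hypothetical helper.

```python
from math import comb, pi, sqrt

def tail_total_progeny(n):
    """Exact value of P(Z > n) = 2^{-2n} * C(2n, n) for the critical binary process."""
    return comb(2 * n, n) / 4 ** n

# Compare the exact tail with the asymptotic 1/sqrt(pi*n).
for n in (1, 10, 100, 1000):
    exact = tail_total_progeny(n)
    approx = 1.0 / sqrt(pi * n)
    print(n, exact, approx, exact / approx)
```

The ratio in the last column approaches 1 as $n$ grows, which is the $n^{-1/2}$ tail behind the index 1/2 stable law used in the heuristic that follows.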

If we ignore interference between successive new type 1 mutations, then
their total progeny $Z_1, Z_2, \ldots$ are i.i.d. variables in the domain of attraction
of a stable law with index 1/2, so $\max_{i \le M} Z_i$ and $Z_1 + \cdots + Z_M$ will be
$O(M^2)$. Therefore, we expect to see our first type 2 mutation in the family
of the $M$th type 1 mutation, where $M = O(1/\sqrt{u_2})$. Standard results for
simple random walk imply that the largest of our first $M$ families will have
$O(M)$ type 1 individuals alive at the same time, so for the branching process
approximation to hold, we need $1/\sqrt{u_2} \ll N$, where here and throughout
the paper, $f(N) \ll g(N)$ means that $f(N)/g(N) \to 0$ as $N \to \infty$. Type 1
mutations occur at rate $Nu_1$, so a type 2 mutation will first occur at a time
of order $1/(Nu_1\sqrt{u_2})$.

As long as the branching process approximation is accurate, the amount of
time we have to wait for a type 1 mutation that will have a type 2 individual
as a descendant will be approximately exponential, since mutations occur
at times of a Poisson process with rate $Nu_1$ and the type 1 mutations that
lead to a type 2 are a thinning of that process in which points are kept with
probability $\sim \sqrt{u_2}$, which is $O(1/M)$, where here and throughout the paper,
$f(N) \sim g(N)$ means that $f(N)/g(N) \to 1$ as $N \to \infty$. The duration of the
longest of $M$ type 1 families is $O(M)$, so the time between when the type
1 mutation occurs and when the type 2 descendant appears is $O(1/\sqrt{u_2})$.
This will be negligible in comparison to $1/(Nu_1\sqrt{u_2})$ as long as $Nu_1 \ll 1$, so
the waiting time for the first type 2 individual will also be approximately
exponential. This leads to a result stated on pages 231-232 of Nowak's book
[20] on Evolutionary Dynamics. If $1/\sqrt{u_2} \ll N \ll 1/u_1$, then

(1.1)    $P(\tau_2 \le t) \approx 1 - \exp(-Nu_1\sqrt{u_2}\,t).$

Figure 1 shows the distribution of $\tau_2 \cdot Nu_1\sqrt{u_2}$ in 10,000 simulations of
the Moran model when $N = 10^3$ and $u_1 = u_2 = 10^{-4}$. Here, $Nu_1 = 0.1$ and
$N\sqrt{u_2} = 10$, so as the last result predicts, the scaled waiting time is approximately
exponential.

Fig. 1. Distribution of $\tau_2 \cdot Nu_1\sqrt{u_2} = \tau_2/1000$ in 10,000 simulations when $N = 10^3$ and
$u_1 = u_2 = 10^{-4}$. $Nu_1 = 0.1$ and $N\sqrt{u_2} = 10$, so as (1.1) predicts the scaled waiting time is
approximately exponential. (Horizontal axis: rescaled time.)

We do not refer to the result given in (1.1) as a theorem because their argu- 
ment is not completely rigorous. For example, the authors use the branching 
process approximation without proving it is valid. However, this is a minor 
quibble, since as the reader will see in Section 2, it is straightforward to fill 
in the missing details and establish the following more general result. 

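Theorem 1. Suppose that $Nu_1 \to \lambda \in [0, \infty)$, $u_2 \to 0$ and $N\sqrt{u_2} \to \infty$
as $N \to \infty$. Then $\tau_2 \cdot Nu_1\sqrt{u_2}$ converges to a limit that has density function

    $f_2(t) = h(t)\exp\left(-\int_0^t h(s)\,ds\right)$  where  $h(s) = \frac{1 - e^{-2s/\lambda}}{1 + e^{-2s/\lambda}}$

if $\lambda > 0$ and $f_2(t) = e^{-t}$ if $\lambda = 0$.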
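
The limiting density can be evaluated in closed form. The sketch below is an illustration added here, not part of the original argument. It uses the identity $(1 - e^{-2x})/(1 + e^{-2x}) = \tanh x$, so that $\int_0^t h(s)\,ds = \lambda \log\cosh(t/\lambda)$ and the tail of the limit is $\cosh(t/\lambda)^{-\lambda}$. The function names are hypothetical.

```python
import math

def hazard(s, lam):
    """h(s) = (1 - e^{-2s/lam}) / (1 + e^{-2s/lam}) = tanh(s/lam); h = 1 when lam = 0."""
    return math.tanh(s / lam) if lam > 0 else 1.0

def survival(t, lam):
    """Tail exp(-int_0^t h(s) ds) of the limit law in Theorem 1, for t >= 0."""
    if lam == 0:
        return math.exp(-t)
    x = t / lam
    # log(cosh(x)) written in a form that does not overflow for large x
    log_cosh = x + math.log1p(math.exp(-2.0 * x)) - math.log(2.0)
    return math.exp(-lam * log_cosh)

def density(t, lam):
    """f_2(t) = h(t) * exp(-int_0^t h(s) ds)."""
    return hazard(t, lam) * survival(t, lam)
```

For small $\lambda$, `survival(t, lam)` is close to $e^{-t}$, in agreement with the exponential limit (1.1). For $\lambda$ of order one the density vanishes at $t = 0$, which is the visible departure from the exponential shape.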

Fig. 2. Distribution of $\tau_2 \cdot Nu_1\sqrt{u_2}$ in 10,000 simulations in a case with $Nu_1 = 1$,
so the limit is not exponential, but is fit well by the result in Theorem 1.
(Horizontal axis: rescaled time.)

Here, $h(t)$ is the hazard function, that is, if we let $\bar F_2(t) = \exp(-\int_0^t h(s)\,ds)$
be the tail of the distribution, then $h(t) = f_2(t)/\bar F_2(t)$. Figure 2 shows the
distribution of $\tau_2 \cdot Nu_1\sqrt{u_2}$ in 10,000 simulations of the Moran model when
N = 10^, ui = and U2 = 10~^. Nui = 1 so the limit is not exponential, 
but Theorem 1 gives a good fit to the observed distribution. 

Before turning to the case of m mutations, we should clarify one point. 
In our model, mutations occur during the lifetime of an individual, but in 
the following discussion, we will count births to estimate the probability a 
desired mutation will occur. This might seem to only be appropriate if muta- 
tions occur at birth. However, since each individual lives for an exponential 
amount of time with mean 1, the number of "man-hours" $\int_0^{T_0} X_1(s)\,ds$ before
the family dies out at time $T_0$ is roughly the same as the number of
births. In any case, the following discussion is only a heuristic that helps 
explain the answer, but does not directly enter into its proof. 

To extend the analysis to the m-stage waiting time problem, suppose 
M distinct type 1 mutations have appeared. If the family sizes of these M 
mutations can be modeled by independent branching processes, the total 
number of offspring of type 1 individuals will be $O(M^2)$. Because each type
1 individual mutates to type 2 at rate $u_2$, there will be $O(M^2u_2)$ mutations
that produce type 2 individuals. The total progeny of these individuals will
consist of $O(M^4u_2^2)$ type 2 individuals. We can expect to see our first type
3 individual when $M^4u_2^2 = O(1/u_3)$ or $M = O(u_2^{-1/2}u_3^{-1/4})$. Thus, for the
branching process approximation to hold, we need $u_2^{-1/2}u_3^{-1/4} \ll N$. Since
type 1 mutations occur at rate $Nu_1$, the expected waiting time will be of
order $1/(Nu_1u_2^{1/2}u_3^{1/4})$.

To help develop a good mental picture, it is instructive to consider the 
numerical example in which N = 10^, ui = 10~^, U2 = 10~^ and U3 = 10~^. 
By the reasoning above, we will first see a type 3 mutation when the number
of type 2's is of order $100 = 1/\sqrt{u_3}$, since in this case there will be of order
$10{,}000 = 1/u_3$ type 2 births before the family dies out. To have a type 2
family reach size 100, we will need 100 mutations from type 1 to type 2,
and for this we will need of order $100/u_2 = 10^7$ type 1 births, which will in
turn occur if the type 1 family reaches size of order $10^{7/2} \approx 3162$. Note that
$X_2(t) \ll X_1(t)$, and within the time that the large type 1 family exists, hundreds
of type 2 families will be started and die out. This difference in the time and
size scales for the processes $X_i(t)$ is a complicating factor in the proof, but
ultimately it also allows us to separate the type 1's from types 2 to $m$ and
use induction.

Extrapolating the calculation above to $m$ stages, we let

(1.2)    $r_{j,m} = u_{j+1}^{1/2}\,u_{j+2}^{1/4}\cdots u_m^{1/2^{m-j}}$

for $1 \le j < m$, and set $r_{m,m} = 1$ and $r_{0,m} = u_1r_{1,m}$. Let $q_{j,m}$ be the probability
a type $j$ individual gives rise to a type $m$ descendant. We will show that
$q_{j,m} \sim r_{j,m}$, so we will need of order $1/r_{j,m}$ mutations to type $j$ before time
$\tau_m$.



Theorem 2. Fix an integer $m \ge 2$. Suppose that:

(i) $Nu_1 \to 0$.

(ii) For $j = 1, \ldots, m - 1$, there is a constant $b_j > 0$ such that $u_{j+1}/u_j > b_j$
for all $N$.

(iii) There is an $a > 0$ so that $N^au_m \to 0$.

(iv) $Nr_{1,m} \to \infty$.

Then for all $t > 0$,

(1.3)    $\lim_{N\to\infty} P(\tau_m > t/(Nr_{0,m})) = \exp(-t).$

As discussed above, condition (iv), which says $1/r_{1,m} \ll N$, is needed for
the branching process assumption to be valid, and condition (i) is needed
for the waiting time to be exponential, because if (i) fails then the time
between the type 1 mutation that will have a type $m$ descendant and the
birth of the type $m$ descendant cannot be neglected. If $u_j = \mu$ for all $j$, (ii) is
trivial. In this case $r_{1,m} = \mu^{a(m)}$, where $a(m) = 1 - 2^{-(m-1)}$. Conditions (i)
and (iv) become $N^{-1/a(m)} \ll \mu \ll N^{-1}$, and when condition (i) is satisfied,
(iii) holds.
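
The constants $r_{j,m}$ are easy to compute directly from their definition. The following sketch is added for illustration; the function name `r_jm` and the convention that `u[k]` holds $u_k$ (with `u[0]` unused) are assumptions of this example. It also checks the relation $r_{j,m}^2 = u_{j+1}r_{j+1,m}$, which follows from the definition, and the special case $r_{1,m} = \mu^{a(m)}$.

```python
import math

def r_jm(j, m, u):
    """r_{j,m} = u_{j+1}^{1/2} u_{j+2}^{1/4} ... u_m^{1/2^{m-j}}.

    u is indexed so that u[k] = u_k for k = 1..m (u[0] is unused).
    By convention r_{m,m} = 1 and r_{0,m} = u_1 * r_{1,m}.
    """
    if j == m:
        return 1.0
    if j == 0:
        return u[1] * r_jm(1, m, u)
    prod = 1.0
    for i in range(j + 1, m + 1):
        prod *= u[i] ** (0.5 ** (i - j))
    return prod

# Equal mutation rates u_j = mu: r_{1,m} should equal mu^{1 - 2^{-(m-1)}}.
mu, m = 1e-5, 4
u = [None] + [mu] * m
print(r_jm(1, m, u), mu ** (1 - 2.0 ** (-(m - 1))))
```

With equal rates, the time scale in Theorem 2 is $1/(Nr_{0,m}) = 1/(N\mu^{1 + a(m)})$.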

Conditions (ii) and (iii) are technicalities that allow us to prove the
result without having to suppose that $u_j = \mu$, which would not be natural
in modeling cancer. In the presence of (ii), condition (iii) ensures that
$\max_{j \le m} u_j \ll N^{-a}$ for some $a > 0$. This is natural because even in the late
stages of progression to cancer, the per cell division mutation probabilities
are small.

Condition (ii) is motivated by the fact that in most cancers we expect 
Uj to be increasing in j. The simple extension of this given in (ii) is useful 
so that we do not rule out some interesting special cases. In modeling the 
tumor suppressor genes mentioned earlier, it is natural to take $u_1 = 2\mu$ and
$u_2 = \mu$, that is, at the first stage a mutation can knock out one of the two
copies of the gene, but after this occurs, there is only one copy subject to
mutation. A case with $u_1/u_2 = 30$ occurs in Durrett and Schmidt's study of
regulatory sequence evolution [9]. 

Condition (iv) ensures that an individual of type m will appear before any 
type 1 mutation achieves fixation. In the case m = 2, Iwasa et al. [13] called 
this stochastic tunneling. A given type 1 mutation fixates with probability 
$1/N$ and type 1 mutations occur at rate approximately $Nu_1$, so fixation
occurs before a type $m$ individual appears if $Nr_{1,m} \to 0$, and then once a
type 1 mutation fixates, the problem reduces to the problem of waiting for
$m - 1$ additional mutations. In the borderline case considered in the next
result, either a type $m$ individual could appear before fixation, or a type
$m$ mutation could be achieved through the fixation of type 1 individuals
followed by the generation of an individual with $m - 1$ additional mutations.

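Theorem 3. Fix an integer $m \ge 2$. Assume conditions (i), (ii) and (iii)
from Theorem 2 hold. If $(Nr_{1,m})^2 \to \gamma > 0$, and we let

(1.4)    $\alpha = \sum_{k=1}^{\infty} \frac{\gamma^k}{(k-1)!\,(k-1)!} \Big/ \sum_{k=1}^{\infty} \frac{\gamma^k}{k!\,(k-1)!}$,

then for all $t > 0$, $\lim_{N\to\infty} P(u_1\tau_m > t) = \exp(-\alpha t)$.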

Figure 3 shows the distribution of U1T2 in 10,000 simulations of the Moran 
model when = 10^, ui = 10"^ and U2 = 10~^ . Nui = 0.1 and N^= 1, 
so the assumptions of Theorem 3 hold with $\gamma = 1$. Numerically evaluating
the constant gives $\alpha = 1.433$ and, as the figure shows, the exponential with
this rate gives a reasonable fit to the simulated data.



Theorem 3 will be proved by reducing the general case to a two-type model
with $\bar u_1 = u_1$ and $\bar u_2 = u_2q_{2,m} \sim r_{1,m}^2$. We will show that it suffices to do
calculations for a model in which type 1 mutations are not allowed when the
number of type 1 individuals $X_1(t)$ is positive. In this case, if we start with
$X_1(0) = N\varepsilon$ then $N^{-1}X_1(Nt) \Rightarrow Z_t$ where $Z_t$ is the Wright-Fisher diffusion
process with infinitesimal generator $x(1 - x)\,d^2/dx^2$. When $X_1(Nt) = Nx$,
mutations to type 2 that eventually lead to a type $m$ individual occur at
rate approximately

    $N \cdot Nx \cdot u_2q_{2,m} \sim N^2r_{1,m}^2x \to \gamma x$,

so, if we let $u(x)$ be the probability that the process $Z_t$ hits 0 before reaching
1 or generating a type $m$ mutation, then $u(x)$ satisfies

(1.5)    $x(1 - x)u''(x) - \gamma xu(x) = 0$,  $u(0) = 1$,  $u(1) = 0$.

The constant $\alpha = \lim_{\varepsilon \to 0}(1 - u(\varepsilon))/\varepsilon$. Its relevance for the problem is that
starting from a single type 1 individual, the probability of reaching $N$ or
generating a type $m$ mutation is $\sim \alpha/N$. Since mutations to type 1 occur at
rate $\sim Nu_1$, the waiting time is roughly exponential with rate $u_1\alpha$.



Fig. 3. Distribution of $u_1\tau_2$ in a case with $Nu_1 = 0.1$ and
$N\sqrt{u_2} = 1$, so we are in the regime covered by Theorem 3. The constant $\gamma = 1$ so $\alpha = 1.433$.
As the graph shows, the exponential distribution with rate $\alpha$ gives a reasonably good fit to
the simulated data. (Horizontal axis: rescaled time.)
One can check (see Lemma 6.9 below) that (1.5) can be solved by the
following power series around $x = 1$:

(1.6)    $u(x) = c\sum_{k=1}^{\infty} \frac{\gamma^k}{k!\,(k-1)!}(1 - x)^k.$

Picking $c$ so that $u(0) = 1$, it follows that $\alpha$ has the form given in (1.4).
Another approach to solving (1.5) is to use the Feynman-Kac formula; see 
formula (3.19.5b) on page 225 of [4]. 
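
The constant $\alpha$ can be evaluated numerically from the power series. The sketch below is an illustration added here. It assumes the series coefficients $\gamma^k/(k!\,(k-1)!)$, so that $\alpha = -u'(0)$ is the ratio of $\sum_k \gamma^k/((k-1)!)^2$ to $\sum_k \gamma^k/(k!\,(k-1)!)$. The function name `alpha` and the truncation level `terms` are choices made for this example.

```python
from math import factorial

def alpha(gamma, terms=60):
    """Ratio of the two truncated series that define the rate constant alpha."""
    num = sum(gamma ** k / (factorial(k - 1) ** 2) for k in range(1, terms + 1))
    den = sum(gamma ** k / (factorial(k) * factorial(k - 1)) for k in range(1, terms + 1))
    return num / den

print(round(alpha(1.0), 3))   # the value quoted in the text for gamma = 1 is 1.433
```

As $\gamma \to 0$ the ratio tends to 1, which matches the regime in which a type 1 mutation must fixate (probability $1/N$) before further progress is made.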

We do not discuss in this paper the case $Nu_1 \to \infty$. We instead refer the
reader to [23], where asymptotic results in this regime are obtained in the
special case when $u_j = \mu$ for all $j$.

The rest of this paper is organized as follows. In Section 2, we give the 
proof of Theorem 1. In Section 3, we collect some results for a two-type
population model that will be useful later in the paper. In Section 4, we 
calculate by induction the probability that a given type 1 individual has 
a type m descendant. In Section 5, we combine this result with a Poisson 
approximation result of Arratia, Goldstein and Gordon [2] to prove Theorem 
2. Theorem 3 is proved in Sections 6 and 7. Throughout our proofs, C denotes 
a constant whose value is unimportant and will change from line to line. 

2. Proof of Theorem 1. If we let $X_1(t)$ be the number of type 1 individuals
at time $t$ then

(2.1)    $P(\tau_2 > t) = E\exp\left(-u_2\int_0^t X_1(s)\,ds\right)$

because at time $s$, there are $X_1(s)$ individuals each experiencing type 2 mutations
at rate $u_2$. We will compare $X_1(t)$ with a continuous-time branching
process with immigration, $Y(t)$. When $X_1(t) = k$, type 1 mutations occur
at rate $(N - k)u_1$, while birth events in which a type 1 individual replaces
a type 0 individual occur at rate $k(N - k)/N$, so before time $\tau_2$, we have
jumps

    $k \to k + 1$ at rate $(k + Nu_1)\,\frac{N - k}{N}$,

    $k \to k - 1$ at rate $k\,\frac{N - k}{N}$.

In the branching process with immigration, $Y(t)$, we have jumps

    $k \to k + 1$ at rate $k + Nu_1$,

    $k \to k - 1$ at rate $k$.

Therefore, up to time $\tau_2$, the process $\{X_1(t), t \ge 0\}$ is a time-change of
$\{Y(t), t \ge 0\}$, in which time runs slower than in the branching process by a
factor of $(N - X_1(t))/N$. That is, if

    $T(t) = \int_0^t \frac{N - X_1(s)}{N}\,ds$,

then the two processes can be coupled so that $X_1(t) = Y(T(t))$ for all $t \ge 0$.
The time change will have little effect as long as $X_1(t)$ is $o(N)$. The next
lemma shows that on the relevant time scale, the number of mutants stays
small with high probability.

Lemma 2.1. Fix $t > 0$, $\varepsilon > 0$, and let $M_t = \max_{0 \le s \le t/(Nu_1\sqrt{u_2})} X_1(s)$.
We have

    $\lim_{N \to \infty} P(M_t > \varepsilon N) = 0.$

Proof. Since mutant individuals give birth and die at the same rate,
the process $\{X_1(s), s \ge 0\}$ is a submartingale. Because the rate of type 1
mutations is always bounded above by $Nu_1$, we have $EX_1(s) \le Nu_1s$ for all
$s$. By Doob's maximal inequality,

    $P(M_t > \varepsilon N) \le \frac{EX_1(t/(Nu_1\sqrt{u_2}))}{\varepsilon N} \le \frac{t}{\varepsilon N\sqrt{u_2}}$,

which goes to zero as $N \to \infty$, since $N\sqrt{u_2} \to \infty$. □



Using the time change in (2.1), we have

    $P(\tau_2 > t/(Nu_1\sqrt{u_2})) = E\exp\left(-u_2\int_0^{t/(Nu_1\sqrt{u_2})} Y(T(s))\,ds\right).$

Changing variables $r = T(s)$, which means $s = U(r)$, where $U = T^{-1}$, $ds =
U'(r)\,dr$ and the above is

    $P(\tau_2 > t/(Nu_1\sqrt{u_2})) = E\exp\left(-u_2\int_0^{T(t/(Nu_1\sqrt{u_2}))} Y(r)U'(r)\,dr\right).$

When $M_t \le N\varepsilon$, $1 \ge T'(t) \ge 1 - \varepsilon$, so the inverse function has slope $1 \le
U'(r) \le 1/(1 - \varepsilon)$. Thus, in view of Lemma 2.1, it is enough to prove the
result for the branching process, $Y(t)$.

Use $Q$ to denote the distribution of $\{Y(t), t \ge 0\}$, and let $Q_1$ denote the
law of the process starting from a single type 1 and modified to have no
further mutations to type 1. We first compute $g_2(t) = Q_1(\tau_2 \le t)$. Wodarz
and Komarova [24] do this, see pages 37-39, by using Kolmogorov's forward
equation to get a partial differential equation

    $\frac{\partial\phi}{\partial t}(t, y) = (y^2 - (2 + u_2)y + 1)\,\frac{\partial\phi}{\partial y}(t, y)$

for the generating function $\phi(t, y) = \sum_j Q_1(X_1(t) = j, X_2(t) = 0)\,y^j$ of the
system in which type 2's are not allowed to give birth or die. They use the 
method of characteristics to reduce the PDE to a Riccati ordinary differential 
equation. To help readers who want to follow their derivation, we note that 
the last equation on page 38 is missing a factor of j in the last term and in 
the change of variables from y to z on page 39, 2 should be r. 

Here, we will use Kolmogorov's backward differential equation to derive
an ODE, which has the advantage that it generalizes easily to the $m$ stage
problem. By considering what happens between time 0 and $h$,

    $g_2(t + h) = g_2(t)[1 - (2 + u_2)h] + h[2g_2(t) - g_2(t)^2] + h \cdot 0 + u_2h \cdot 1 + o(h)$,

where the four terms correspond to nothing happening, a birth, a death and
a mutation of the original type 1 to type 2. Doing some algebra and letting
$h \to 0$, we get

(2.2)    $g_2'(t) = -u_2g_2(t) - g_2(t)^2 + u_2.$

If we let $r_1 > r_2$ be the solutions of $x^2 + u_2x - u_2 = 0$, that is,

(2.3)    $r_i = \frac{-u_2 \pm \sqrt{u_2^2 + 4u_2}}{2}$,

we can write this as

    $g_2'(t) = -(g_2(t) - r_1)(g_2(t) - r_2).$

Now let $g_2(\infty)$ be the probability that a type 2 offspring is eventually generated
in the branching process. Letting $t \to \infty$ in (2.2) and noticing that $t \mapsto g_2(t)$
is increasing implies $g_2'(t) \to 0$, so that

    $0 = -u_2g_2(\infty) - g_2(\infty)^2 + u_2$,

so $g_2(\infty) = r_1$, $0 < g_2(t) < r_1$ for all $t > 0$ and we have

    $1 = \frac{g_2'(t)}{(r_1 - g_2(t))(g_2(t) - r_2)} = \frac{1}{r_1 - r_2}\left(\frac{g_2'(t)}{g_2(t) - r_2} + \frac{g_2'(t)}{r_1 - g_2(t)}\right).$

Integrating,

    $\ln(g_2(t) - r_2) - \ln(r_1 - g_2(t)) = (r_1 - r_2)t - \ln A$,

where $A$ is a constant that will be chosen later, so we have

    $\frac{g_2(t) - r_2}{r_1 - g_2(t)} = \frac{1}{A}\,e^{(r_1 - r_2)t}.$

A little algebra gives

    $g_2(t) = \frac{r_1 + Ar_2e^{(r_2 - r_1)t}}{1 + Ae^{(r_2 - r_1)t}}.$

We have $g_2(0) = 0$, so $A = -r_1/r_2$ and

    $g_2(t) = \frac{r_1(1 - e^{(r_2 - r_1)t})}{1 - (r_1/r_2)e^{(r_2 - r_1)t}}.$

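To prepare for the asymptotics note that (2.3) and the assumption that
$u_2 \to 0$ imply that $r_1 - r_2 = \sqrt{u_2^2 + 4u_2} \sim 2\sqrt{u_2}$, $r_1 \sim \sqrt{u_2}$ and $r_1/r_2 \to -1$
so

    $g_2(t) \approx \sqrt{u_2}\,\frac{1 - e^{-2\sqrt{u_2}t}}{1 + e^{-2\sqrt{u_2}t}}$

or to be precise, if $t\sqrt{u_2} \to s$, then

(2.4)    $g_2(t) \sim \sqrt{u_2}\,\frac{1 - e^{-2s}}{1 + e^{-2s}}.$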
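
The closed form for $g_2$ can be checked numerically against the ODE $g_2' = -u_2g_2 - g_2^2 + u_2$ and against the approximation $\sqrt{u_2}\tanh(t\sqrt{u_2})$, which is (2.4) rewritten with $\tanh$. This is a sketch added for illustration; the function name `g2` and the test values of $u_2$ and $t$ are arbitrary choices.

```python
import math

def g2(t, u2):
    """Closed-form solution of g' = -u2*g - g^2 + u2 with g(0) = 0."""
    disc = math.sqrt(u2 * u2 + 4.0 * u2)
    r1 = (-u2 + disc) / 2.0          # positive root of x^2 + u2*x - u2 = 0
    r2 = (-u2 - disc) / 2.0          # negative root
    e = math.exp((r2 - r1) * t)
    return r1 * (1.0 - e) / (1.0 - (r1 / r2) * e)

u2, t, h = 1e-4, 50.0, 1e-3
lhs = (g2(t + h, u2) - g2(t - h, u2)) / (2.0 * h)      # numerical derivative
rhs = -u2 * g2(t, u2) - g2(t, u2) ** 2 + u2            # right side of the ODE
approx = math.sqrt(u2) * math.tanh(t * math.sqrt(u2))  # asymptotic form
print(lhs - rhs, g2(t, u2) / approx)
```

The first printed number should be near zero, and the second close to 1 with an error of order $\sqrt{u_2}$.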



Lemma 2.2. The waiting time for the first type 2 in the branching process
with immigration, in which type 1 mutations arrive at rate $Nu_1$, satisfies

(2.5)    $Q(\tau_2 \le t) = 1 - \exp\left(-Nu_1\int_0^t Q_1(\tau_2 \le s)\,ds\right).$

Proof. Type 1 mutations are a Poisson process with rate $Nu_1$. A point
at time $t - s$ is a success, that is, produces a type 2 before time $t$, with
probability $Q_1(\tau_2 \le s)$. By results for thinning a Poisson process, the number
of successes by time $t$ is Poisson with mean $Nu_1\int_0^t Q_1(\tau_2 \le s)\,ds$. The result
follows from the observation that $Q(\tau_2 \le t)$ is the probability of at least one
success in the Poisson process. □

To find the density function, we recall $g_2(t) = Q_1(\tau_2 \le t)$ and differentiate
to get

    $Nu_1g_2(t)\exp\left(-Nu_1\int_0^t g_2(s)\,ds\right).$

Changing variables, the density function $f_2$ of $\tau_2 \cdot Nu_1\sqrt{u_2}$ is given by

    $f_2(t) = \frac{g_2(t/(Nu_1\sqrt{u_2}))}{\sqrt{u_2}}\exp\left(-Nu_1\int_0^{t/(Nu_1\sqrt{u_2})} g_2(s)\,ds\right).$

Changing variables $r = sNu_1\sqrt{u_2}$ in the integral, the above is

    $f_2(t) = \frac{g_2(t/(Nu_1\sqrt{u_2}))}{\sqrt{u_2}}\exp\left(-\int_0^t \frac{g_2(r/(Nu_1\sqrt{u_2}))}{\sqrt{u_2}}\,dr\right).$

If $Nu_1 \to 0$, then (2.4) implies that the above converges to $\exp(-t)$. If $Nu_1 \to
\lambda > 0$, the limit is $h(t)\exp(-\int_0^t h(s)\,ds)$ where

    $h(s) = \frac{1 - e^{-2s/\lambda}}{1 + e^{-2s/\lambda}}$,

which completes the proof of Theorem 1.

3. A two-type model. We collect here some results for a simple two-type
population model, which we call model $M_0$. We assume that all individuals
are either type 0 or type 1, and the population size is always $N$. There are
no mutations, and the population evolves according to the Moran model, so
each individual dies at rate 1 and then is replaced by a randomly chosen
individual in the population. Usually we will assume that the process starts
with just one type 1 individual at time zero, but occasionally we will also
need to consider starting the process with $j$ type 1 individuals. Denote by
$P_j$ and $E_j$ probabilities and expectations when the process is started with
$j$ type 1 individuals, and write $P = P_1$ and $E = E_1$. Let $X(t)$ denote the
number of type 1 individuals at time $t$.

Let $T_k = \inf\{t : X(t) = k\}$ be the first time at which there are $k$ type 1
individuals, and let $T = \min\{T_0, T_N\}$ be the first time at which all individuals
have the same type. Let $L_k$ be the amount of time for which there are
$k$ type one individuals, which is the Lebesgue measure of $\{t < T : X(t) = k\}$.
Let $R_k$ be the number of times that the number of type 1 individuals jumps
to $k$ from $k - 1$ or $k + 1$. Let $R = 1 + \sum_{k=1}^{N-1} R_k$ be the total number of births
and deaths of type 1 individuals. Durrett and Schmidt [8] studied this model
and showed that

(3.1)    $E[R_k \mid T_0 < T_N] = \frac{2(N - k)^2}{N(N - 1)}$

and

(3.2)    $E[R_k \mid T_N < T_0] = \frac{2k(N - k)}{N}.$

Equation (3.1) is (16) of [8], while (3.2) comes from the beginning of the
proof of Lemma 3 in [8].

Because $P(T_N < T_0) = 1/N$, it follows from (3.1) and (3.2) that

(3.3)    $E[R_k] = \frac{N - 1}{N}E[R_k \mid T_0 < T_N] + \frac{1}{N}E[R_k \mid T_N < T_0] = \frac{2(N - k)}{N} \le 2$

and, therefore,

(3.4)    $E[R] = 1 + \sum_{k=1}^{N-1} E[R_k] \le 2N.$

If $1 \le j \le N - 1$, then letting $A$ denote the event that there are at least $j$
type 1 individuals at some time, (3.4) gives

(3.5)    $E_j[R] \le E[R \mid A] = jE[R1_A] \le jE[R] \le 2jN.$

Turning to the quantities $L_k$, note that when there are $k$ type 1 individuals,
births and deaths are each happening at rate $k(N - k)/N$, so the number
of type 1 individuals changes again after an exponential time with mean
$N/[2k(N - k)]$. Therefore, (3.3) gives

(3.6)    $E[L_k] = \frac{N}{2k(N - k)}E[R_k] = \frac{1}{k}.$

Since $P_j(T_k < T_0) = j/k$ for $1 \le j \le k$, we have

(3.7)    $E_j[L_k] \le E_1[L_k \mid T_k < T_0] = \frac{E_1[L_k]}{P_1(T_k < T_0)} = 1$,

where to emphasize the change in initial condition, we have written $E$ as
$E_1$. Since $T = \sum_{k=1}^{N-1} L_k$, it also follows from (3.6) that

(3.8)    $E[T] = \sum_{k=1}^{N-1} \frac{1}{k} \le C\log N$

and it follows from (3.7) that for $j = 1, \ldots, N - 1$,

(3.9)    $E_j[T] \le N.$
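
The identities $E[L_k] = 1/k$ and $E_j[L_k] \le 1$ can be verified exactly for small $N$ by solving the linear equations for the expected occupation times of the jump process with up and down rates $k(N - k)/N$. The sketch below is an added illustration using exact rational arithmetic; the function names `solve_linear` and `occupation_times` are hypothetical.

```python
from fractions import Fraction

def solve_linear(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination over Fractions."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        scale = M[col][col]
        M[col] = [x / scale for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def occupation_times(N, start=1):
    """Return [E_start[L_k] for k = 1..N-1] in the two-type Moran model."""
    n = N - 1
    # minus_Q is the negative generator restricted to the transient states 1..N-1
    minus_Q = [[Fraction(0)] * n for _ in range(n)]
    for i in range(1, N):
        rate = Fraction(i * (N - i), N)      # up-jump rate = down-jump rate
        minus_Q[i - 1][i - 1] = 2 * rate
        if i > 1:
            minus_Q[i - 1][i - 2] = -rate
        if i < N - 1:
            minus_Q[i - 1][i] = -rate
    # Row `start` of (-Q)^{-1} is the solution g of (-Q)^T g = e_start
    transpose = [[minus_Q[i][k] for i in range(n)] for k in range(n)]
    e = [Fraction(int(k == start - 1)) for k in range(n)]
    return solve_linear(transpose, e)

print(occupation_times(6))   # expected occupation times starting from one mutant
```

Summing the output gives $E[T]$ as the harmonic sum in (3.8).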

Finally, we will use branching process theory to obtain the following com- 
plement to (3.8). 

Lemma 3.1. There exists a constant $C$ such that $P(T > t) \le C/t$ for all
$0 < t \le N$.

Proof. Consider a continuous-time branching process started with one
individual in which each individual dies at rate one and gives birth at rate
one. Let $T'$ be the time at which the process becomes extinct. By a theorem
of Kolmogorov [16], proved in Section 1.9 of [3], and the fact that a Markovian
continuous-time branching process can be reduced to a discrete time Galton-Watson
process by only examining it at integer times, we see that there is
a constant $C'$ such that $P(T' > t) \le C'/t$ for all $t > 0$.

When there are $k$ individuals in the branching process, births and deaths
happen at rate $k$. When there are $k$ individuals in the model $M_0$, births
and deaths happen at rate $k(N - k)/N$, which is at least $k/2$ as long as
$k \le N/2$. Since the probability that the number of individuals in model $M_0$
ever exceeds $N/2$ is at most $2/N$, we have $P(T > t) \le 2C'/t + 2/N$ for all $t$,
which implies the result. □
4. The probability of a type m descendant. We now consider model $M_1$,
which evolves in the same way as the process described in the Introduction
except that initially there is one type 1 individual and $N - 1$ type 0 individuals,
and no further type 1 mutations occur. The number of individuals
of nonzero type in model $M_1$ therefore evolves exactly like the number of
type 1 individuals in model $M_0$, defined at the beginning of the previous
section, but in model $M_1$ mutations to types greater than one are possible.
The probability, which we denote by $q_m$, that a type $m$ individual is eventually
born in model $M_1$ is the same as the probability that a given type
one individual in the process described in the Introduction has a type $m$
descendant. Our main goal in this section is to prove the following result.

Proposition 4.1. Fix an integer $m \ge 2$. Assume conditions (ii), (iii)
and (iv) of Theorem 2 hold. Then $q_m \sim r_{1,m}$.

We will use Proposition 4.1 to prove Theorem 2. To prove Theorem 3, 
we will need the following corollary. Here we denote by $q_{j,m}$ the probability
that a type $m$ individual eventually appears in a process with initially one
type $j$ individual, $N - 1$ type 0 individuals, and mutations to type 1 are not
allowed.



Corollary 4.1. Fix an integer $m \ge 2$. Assume conditions (ii) and (iii)
of Theorem 2 hold and that $(Nr_{1,m})^2 \to \gamma > 0$. Then $q_{2,m} \sim r_{2,m}$.

Proof. We apply the $m - 1$ case of Proposition 4.1, with $u_3, \ldots, u_m$ in
place of $u_2, \ldots, u_{m-1}$. Since we are assuming (ii) and (iii), we need only to
show that $Nr_{2,m} \to \infty$. However, (ii) and (iii) imply

    $\frac{Nr_{2,m}}{Nr_{1,m}} = \frac{u_3^{1/2}u_4^{1/4}\cdots u_m^{1/2^{m-2}}}{u_2^{1/2}u_3^{1/4}\cdots u_{m-1}^{1/2^{m-2}}u_m^{1/2^{m-1}}} \ge b_2^{1/2}b_3^{1/4}\cdots b_{m-1}^{1/2^{m-2}}\,u_m^{-1/2^{m-1}} \to \infty.$

This result and the assumption $(Nr_{1,m})^2 \to \gamma > 0$ imply $Nr_{2,m} \to \infty$. □

We will prove Proposition 4.1 using a branching process approximation.
We will approximate model $M_1$ by a continuous-time multi-type branching
process in which individuals of type $1 \le j < m$ die at rate 1, give birth at
rate 1 and mutate to individuals of type $j + 1$ at rate $u_{j+1}$. Let $p_{j,m}$ be the
probability that a type $j$ individual eventually has a descendant of type $m$
in the branching process and let $p_m = p_{1,m}$.

Lemma 4.1. If conditions (ii) and (iii) of Theorem 2 hold, then $p_{j,m} \sim r_{j,m}$.

Proof. We proceed by induction starting at $j = m$ and working down
to $j = 1$. Clearly, $p_{m,m} = 1$, so the result is valid for $j = m$. Now assume the
result is true for $j + 1$. By conditioning on the first event in the branching
process, it follows that

    $p_{j,m} = \frac{1}{2 + u_{j+1}}(2p_{j,m} - p_{j,m}^2) + \frac{u_{j+1}}{2 + u_{j+1}}\,p_{j+1,m}.$

Multiplying by $2 + u_{j+1}$ and rearranging, we get $p_{j,m}^2 + bp_{j,m} - u_{j+1}p_{j+1,m} =
0$, where $b = u_{j+1}$. The only positive solution is

(4.1)    $p_{j,m} = \frac{-b + \sqrt{b^2 + 4u_{j+1}p_{j+1,m}}}{2}.$

Calculus tells that for $h > 0$,

    $\sqrt{x + h} - \sqrt{x} = \int_x^{x+h} \frac{1}{2\sqrt{y}}\,dy \le \frac{h}{2\sqrt{x}}$,

so, we have

(4.2)    $0 \le \sqrt{b^2 + 4u_{j+1}p_{j+1,m}} - \sqrt{4u_{j+1}p_{j+1,m}} \le \frac{b^2}{4\sqrt{u_{j+1}p_{j+1,m}}}.$

Conditions (ii) and (iii) imply that $u_{j+1} \ll r_{j+1,m}$ and, therefore, that
$\sqrt{u_{j+1}r_{j+1,m}} \gg b = u_{j+1}$. Since $p_{j+1,m} \sim r_{j+1,m}$ by the induction hypothesis,
it follows from (4.1) and (4.2) that $p_{j,m} \sim \sqrt{u_{j+1}r_{j+1,m}} = r_{j,m}$. The lemma
follows by induction. □
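
The recursion (4.1) is simple to iterate numerically, which gives a concrete feel for how close $p_{j,m}$ is to $r_{j,m}$ at small mutation rates. The following sketch is an added illustration; the function name `descendant_probs` and the mutation rates used are arbitrary choices for this example.

```python
import math

def descendant_probs(u):
    """Iterate (4.1) downward from p_{m,m} = 1.

    u = [u_1, ..., u_m]; returns a dict p with p[j] = p_{j,m} for j = 1..m.
    """
    m = len(u)
    p = {m: 1.0}
    for j in range(m - 1, 0, -1):
        b = u[j]                       # u_{j+1} is stored at list index j
        p[j] = (-b + math.sqrt(b * b + 4.0 * b * p[j + 1])) / 2.0
    return p

u = [1e-7, 1e-6, 1e-6]                 # hypothetical rates u_1, u_2, u_3
p = descendant_probs(u)
r_13 = u[1] ** 0.5 * u[2] ** 0.25      # r_{1,3} = u_2^{1/2} u_3^{1/4}
print(p[1], r_13, p[1] / r_13)
```

The ratio printed at the end is close to, but slightly below, 1. The gap shrinks as the mutation rates decrease, in line with the asymptotic statement of the lemma.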

Remark. One gets the same result for a number of other variants of
the model. We leave it to the reader to check that Lemma 4.1 holds when
mutation only occurs at birth. To prepare for the proof of Lemma 4.7, we
will now show that it holds when type $j$'s give birth to type $j$'s at rate one
and to type $j + 1$'s at rate $u_{j+1}$. In this case, the first equation is

    $p_{j,m} = \frac{1}{2 + u_{j+1}}(2p_{j,m} - p_{j,m}^2) + \frac{u_{j+1}}{2 + u_{j+1}}(p_{j,m} + p_{j+1,m} - p_{j,m}p_{j+1,m})$

and rearranges to become $p_{j,m}^2 + u_{j+1}p_{j+1,m}p_{j,m} - u_{j+1}p_{j+1,m} = 0$. Taking
$b = u_{j+1}p_{j+1,m}$, the proof goes as before.



We will now prove Proposition 4.1 by induction. We begin with the case 
m = 2, in which the comparison with the branching process is straightfor- 
ward. 
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Lemma 4.2. Under the assumptions of Proposition 4-1 with m = 2, we 

u 1/2 
nave q2 ~ ri^2 = ^2 • 

Proof. If we track the number of type 1 individuals in model $M_1$ before
the first type 2 mutation occurs, upward and downward jumps occur at the
same rate, which if there are k type 1 individuals is $k(N-k)/N$. For the
branching process, when there are k type 1 individuals, upward and downward
jumps occur at rate k. Therefore, the embedded jump chain (which
gives the sequence of states visited by the continuous-time chain) is a simple
random walk $S_n$ with $S_0 = 1$ both for model $M_1$ and for the branching
process. Therefore, writing $p_2$ as a function of the underlying mutation rate,
we claim that for any L,

(4.3)    $$p_2(u_2) - 1/N \le q_2 \le p_2(u_2N/(N-L)) + 1/L.$$

The first inequality follows from the fact that unless the number of type 1
individuals in model $M_1$ reaches N, which happens with probability $1/N$,
model $M_1$ has the same embedded jump chain as the branching process and
jumps more slowly. For the second inequality, we note that the probability
the Moran model reaches height L is $1/L$. When this does not occur, the
Moran model always jumps at rate at least $(N-L)/N$ times the branching
process rate. Lemma 4.1 gives $p_2(u_2) \sim u_2^{1/2}$. Condition (iv) gives
$Nu_2^{1/2} \to \infty$, so we can choose L such that $L/N \to 0$ and $Lu_2^{1/2} \to \infty$.
Under these conditions, (4.3) implies $q_2 \sim u_2^{1/2}$. □
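
The quantity $q_2$ can also be computed exactly for a given N and $u_2$, which gives a concrete check of (4.3). The sketch below is an illustration under stated assumptions: it treats the number of type 1 individuals before the first type 2 mutation as a birth-death chain with jump rates $k(N-k)/N$ up and down and killing rate $ku_2$, and solves the resulting tridiagonal system with the Thomas algorithm. The function name and the sample values of N and $u_2$ are hypothetical.

```python
import math

def q2_moran(N, u2):
    """Probability that one type 1 individual in a Moran population of size N
    has a type 2 descendant, when each type 1 mutates at rate u2.

    Solves h(0) = 0, h(N) = 1 and, for 0 < k < N,
        (2*a_k + u2) * h(k) - a_k * (h(k-1) + h(k+1)) = u2,   a_k = (N - k) / N,
    (all rates divided by k) with the Thomas algorithm for tridiagonal systems.
    """
    n = N - 1                      # unknowns h(1), ..., h(N-1)
    c_prime = [0.0] * n            # modified upper diagonal
    d_prime = [0.0] * n            # modified right-hand side
    for idx in range(n):
        k = idx + 1
        a = (N - k) / N
        diag, off = 2.0 * a + u2, -a
        rhs = u2 + (a if k == N - 1 else 0.0)   # boundary term a * h(N)
        if idx == 0:
            denom = diag                         # h(0) = 0 contributes nothing
            d_prime[idx] = rhs / denom
        else:
            denom = diag - off * c_prime[idx - 1]
            d_prime[idx] = (rhs - off * d_prime[idx - 1]) / denom
        c_prime[idx] = off / denom
    h = d_prime[-1]
    for idx in range(n - 2, -1, -1):             # back substitution
        h = d_prime[idx] - c_prime[idx] * h
    return h                                      # h(1) = q_2

N, u2 = 2000, 1e-4
q2 = q2_moran(N, u2)
print(q2, math.sqrt(u2))
```

With these sample values $Nu_2^{1/2} = 20$, so the two printed numbers should be of the same order; the agreement improves as $Nu_2^{1/2}$ grows.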

For the rest of this section, we will fix m and assume that the assumptions
of Proposition 4.1 hold. We will also assume that Proposition 4.1 has been
established for m - 1, which implies that $q_{2,m} \sim r_{2,m}$. We will reduce the
general case to the m = 2 case in which type 2 mutations occur at rate $u_2r_{2,m}$.
The next two lemmas will allow us to ignore certain type 2 mutations.

Lemma 4.3. Let $A_m$ be the event that in model $M_1$ some type 2 mutation
that occurs while there is another individual in the population of type 2 or
higher has a type m descendant. Then $P(A_m) \ll r_{1,m}$.

Proof. Let $\varepsilon > 0$. Let B be the event that the number of individuals in
the population of type 1 or higher never exceeds $\varepsilon^{-1}r_{1,m}^{-1}$, so $P(B^c) \le \varepsilon r_{1,m}$.
Let U = {t: there is an individual of type 2 or higher alive at time t}. On B,
type 2 mutations occur at rate at most $\varepsilon^{-1}r_{1,m}^{-1}u_2$ and have a type m
descendant with probability $q_{2,m}$. Therefore, letting |U| denote the Lebesgue
measure of U, we have

$$P(A_m) \le \varepsilon r_{1,m} + E[|U|1_B]\,\varepsilon^{-1}r_{1,m}^{-1}u_2q_{2,m}.$$

For $k \le \varepsilon^{-1}r_{1,m}^{-1}$, it follows from (3.6) that the expected amount of time for
which there are k individuals of type 1 or higher is at most $1/k$, and so the expected
number of type 2 mutations during this time is at most $(1/k)(ku_2) = u_2$.
Therefore, the expected number of type 2 mutations while there are at
most $\varepsilon^{-1}r_{1,m}^{-1}$ individuals of type 1 or higher is at most $\varepsilon^{-1}r_{1,m}^{-1}u_2$. By
(3.8), the expected amount of time for which these mutations or their offspring
are alive in the population is at most $(C\log N)\varepsilon^{-1}r_{1,m}^{-1}u_2$. Therefore,
$E[|U|1_B] \le (C\log N)\varepsilon^{-1}r_{1,m}^{-1}u_2$. Since $q_{2,m} \sim r_{2,m}$ by the induction
hypothesis and $u_2r_{2,m} = r_{1,m}^2$, it follows that there exists a constant C such that

$$P(A_m) \le \varepsilon r_{1,m} + C(\log N)\varepsilon^{-2}r_{1,m}^{-2}u_2^2r_{2,m} = \varepsilon r_{1,m} + C(\log N)\varepsilon^{-2}u_2.$$

Conditions (ii) and (iii) imply that there exist constants $C_1$ and $C_2$ such
that

$$\frac{u_2\log N}{r_{1,m}} \le C_1u_2^{2^{-(m-1)}}\log N \le C_2u_m^{2^{-(m-1)}}\log N \to 0.$$

It follows that

$$\limsup_{N\to\infty} r_{1,m}^{-1}P(A_m) \le \varepsilon,$$

which implies the lemma. □

Lemma 4.4. Let $\varepsilon > 0$. Let $B_m$ be the event that in model $M_1$ some type
2 mutation that occurs while there are fewer than $\varepsilon r_{1,m}^{-1}$ individuals in the
population of type 1 or higher has a type m descendant. Then there is a
constant C, not depending on $\varepsilon$, such that $P(B_m) \le C\varepsilon r_{1,m}$.

Proof. As noted in the proof of Lemma 4.3, the expected number of
type 2 mutations while there are k individuals of type 1 or higher is at most $u_2$.
Therefore, the expected number of type 2 mutations while there are fewer
than $\varepsilon r_{1,m}^{-1}$ individuals of type 1 or higher is at most $\varepsilon r_{1,m}^{-1}u_2$. By the
induction hypothesis, each such mutation produces a type m descendant with
probability $q_{2,m} \sim r_{2,m}$, so the probability that one of these mutations
produces a type m descendant is at most $C\varepsilon r_{1,m}^{-1}r_{2,m}u_2$. The desired result now
follows from the fact that $u_2r_{2,m} = r_{1,m}^2$. □

Our strategy is to show that we can reduce the problem to the m = 2 case
by assuming that each type 2 mutation independently generates a type m
descendant with probability $q_{2,m}$. Complicating this picture is the fact that
the evolution of the number of type 1 individuals (which produce the type 2
mutations) is not independent of the success of the type 2 mutations because
a new individual of type $j \ge 2$ may replace an existing type 1 individual and
vice versa. To show that this is not a significant problem, we will construct
a coupling of model $M_1$ with another process in which this dependence has
been eliminated. We first define model $M_2$ to evolve like model $M_1$ except
that initially there are k individuals of type 1 and N - k of type 0, and type
2 mutations are only permitted when there are no individuals of type $j \ge 2$.
We then compare model $M_2$ to model $N_2$, in which the type 1 individuals
are decoupled from type 2 individuals and their offspring by declaring that
(provided a type 0 individual exists):

• if a proposed move exchanges a type 1 and a type $j \ge 2$, we instead
exchange a type 0 and a type j;

• a mutation that occurs to a type 1 produces a new type 2 individual but
replaces a type 0 individual instead of the type 1 that mutated.

To define the coupling precisely, introduce a Poisson process with rate N
at which the successive exchanges will occur and let $i_n$ and $j_n$ be independent
i.i.d. uniform on {1, 2, ..., N}. In both models, we replace individual $i_n$ with
a copy of individual $j_n$. In model $N_2$, if $i_n$ has type 1 and $j_n$ has type 2
or higher, then we choose a type 0 individual at random to become type
1, so that the number of type 1 individuals stays the same. Likewise, if $i_n$
has type 2 or higher and $j_n$ has type 1, then we choose a type 1 individual
to become type 0 in model $N_2$. This recipe breaks down when there are no
individuals of type 0. However, Lemma 4.5 shows that with high probability
the number of individuals of nonzero type is o(N) up to time $\tau_m$. For the
mutations, we have for each $1 \le i \le N$ a Poisson process with rate $u_2$, which
in both models causes a mutation of the ith individual, unless either the
ith individual has type 0 or the ith individual has type 1 and there is an
individual of type 2 or higher in the population. In model $N_2$, if a type 1
individual mutates to type 2, a type 0 individual is chosen at random to
become type 1, to keep the number of type 1 individuals constant.
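
The exchange rule for model $N_2$ can be illustrated by a small sketch. It is a toy rendering under assumptions: the population is a Python list of integer types, the function name `exchange_n2` is hypothetical, and mutations are left out. It only shows that the two special cases leave the number of type 1 individuals unchanged.

```python
import random

def exchange_n2(types, i, j, rng):
    """Apply one proposed exchange 'replace individual i by a copy of j'
    under the model N_2 rule, mutating the list `types` in place.

    Assumes at least one type 0 individual exists when it is needed.
    """
    ti, tj = types[i], types[j]
    types[i] = tj                                  # ordinary Moran replacement
    if ti == 1 and tj >= 2:
        # A type 1 was lost: promote a random type 0 to type 1, so that
        # effectively a type 0 (not a type 1) was replaced by type j.
        zeros = [k for k, t in enumerate(types) if t == 0]
        types[rng.choice(zeros)] = 1
    elif ti >= 2 and tj == 1:
        # A type 1 was gained: demote a random type 1 to type 0, so that
        # effectively the type >= 2 individual was replaced by a type 0.
        ones = [k for k, t in enumerate(types) if t == 1]
        types[rng.choice(ones)] = 0

rng = random.Random(0)
pop = [0] * 6 + [1] * 3 + [2]        # toy population, N = 10
exchange_n2(pop, 6, 9, rng)           # type 1 replaced by a copy of type 2
after_first = pop.count(1), pop.count(2)
exchange_n2(pop, 9, 7, rng)           # type 2 replaced by a copy of type 1
after_second = pop.count(1), pop.count(2)
print(after_first, after_second)
```

In both calls the count of type 1 individuals stays at 3, while the count of individuals of type 2 or higher moves up and then down by one.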

Let $X_1(t)$ and $Y_1(t)$ be the number of type 1 individuals at time t in
models $M_2$ and $N_2$, respectively. Let $Z(t) = X_1(t) - Y_1(t)$. Let $X_2(t)$ and
$Y_2(t)$ be the number of individuals in models $M_2$ and $N_2$, respectively, of
type greater than or equal to 2. Note that by renumbering the individuals
as the process evolves if necessary, we can ensure that for all $t \ge 0$, at time t
there are $\min\{X_1(t), Y_1(t)\}$ integers j such that the jth individual has type
1 in both model $M_2$ and model $N_2$. Note also that with the above coupling,
if a type 2 mutation occurs at the same time in both models, descendants of
this mutation will always have the same type in both models. This means
that if the mutation has a type m descendant in one model, then it will
have a type m descendant in the other. Finally, as long as the number of
individuals of nonzero type stays below N/2, we can also ensure that there
is no j such that the jth individual has type 1 in one of the two models and
type 2 or higher in the other. The lemma below, combined with condition
(iv), ensures that in both models, the number of individuals of nonzero type
stays much smaller than N.

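Lemma 4.5. Fix t > 0. Suppose $X_1(0) = Y_1(0) = [\varepsilon r_{1,m}^{-1}]$ and $X_2(0) = Y_2(0) = 0$.
Assume f is a function of N such that $f(N)r_{1,m} \to \infty$ as $N \to \infty$.
Then using $\to_p$ to denote convergence in probability, we have

$$\max_{0 \le s \le tr_{1,m}^{-1}} \frac{X_1(s) + X_2(s)}{f(N)} \to_p 0 \quad\text{and}\quad \max_{0 \le s \le tr_{1,m}^{-1}} \frac{Y_1(s) + Y_2(s)}{f(N)} \to_p 0.$$

Proof. In model $M_2$, individuals of type 1 or higher give birth and die
at the same rate, so $(X_1(s) + X_2(s), s \ge 0)$ is a martingale and

$$E[X_1(tr_{1,m}^{-1}) + X_2(tr_{1,m}^{-1})] = X_1(0) + X_2(0) = [\varepsilon r_{1,m}^{-1}].$$

By Doob's maximal inequality, if $\delta > 0$, then

$$P\left(\max_{0 \le s \le tr_{1,m}^{-1}} \frac{X_1(s) + X_2(s)}{f(N)} > \delta\right) \le \frac{E[X_1(tr_{1,m}^{-1}) + X_2(tr_{1,m}^{-1})]}{\delta f(N)} \le \frac{\varepsilon r_{1,m}^{-1}}{\delta f(N)} \to 0$$

as $N \to \infty$, which implies the first statement of the lemma.

In model $N_2$, mutations of type 1 individuals cause new type 2 individuals
to replace type 0 individuals. Births and deaths occur at the same rate, so
the process $(Y_1(s), s \ge 0)$ is a martingale, while $(Y_1(s) + Y_2(s), s \ge 0)$ is a
submartingale. Now $E[Y_1(s)] = [\varepsilon r_{1,m}^{-1}]$ for all s, so the expected number of
type 2 individuals that appear before time $tr_{1,m}^{-1}$ because of mutation is at
most $\varepsilon r_{1,m}^{-1} \cdot tr_{1,m}^{-1} \cdot u_2 = \varepsilon tu_2r_{1,m}^{-2}$. It follows that

$$E[Y_1(tr_{1,m}^{-1}) + Y_2(tr_{1,m}^{-1})] \le \varepsilon r_{1,m}^{-1} + \varepsilon tu_2r_{1,m}^{-2}.$$

Now

(4.4)    $$u_2r_{1,m}^{-1} = \frac{u_2}{u_2^{1/2}u_3^{1/4}\cdots u_m^{1/2^{m-1}}} = \frac{u_2^{1-2^{-(m-1)}}}{u_2^{1/2}u_3^{1/4}\cdots u_m^{1/2^{m-1}}} \cdot u_2^{2^{-(m-1)}} \to 0$$

because condition (ii) implies that the first factor is bounded by a constant,
so Doob's maximal inequality this time gives

$$P\left(\max_{0 \le s \le tr_{1,m}^{-1}} \frac{Y_1(s) + Y_2(s)}{f(N)} > \delta\right) \le \frac{\varepsilon r_{1,m}^{-1} + \varepsilon u_2r_{1,m}^{-2}t}{\delta f(N)} \to 0,$$

which implies the second half of the lemma. □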

We now work on bounding the process $(Z(t), t \ge 0)$. There are three types
of events that cause this process to jump. First, whenever a type 1 individual
in model $M_2$ mutates to type 2, there is no corresponding change in model
$N_2$, because any new type 2 individual in model $N_2$ resulting from mutation
replaces a type 0. These changes cause the Z process to decrease by one.
Letting $\mu(t)$ be the rate at which they are occurring at time t, we have

$$0 \le \mu(t) \le u_2X_1(t),$$

where the second inequality could be strict because mutations are suppressed
if there is already a type 2 individual in the population.

Second, one of the $|Z(t)|$ "extra" type 1 individuals in one process or the
other could experience a birth or a death. This could cause the Z process
to increase or decrease by one. If $X_1(t) > Y_1(t)$, then at time t, both
increases and decreases in the Z process occur because of such changes at rate
$|Z(t)|(N - |Z(t)|)/N$, because the Z process changes unless the other
individual involved in the exchange was also one of the $|Z(t)|$ individuals that
are type 1 in model $M_2$ but not model $N_2$. If $Y_1(t) > X_1(t)$, then increases
and decreases in the Z process occur at rate $|Z(t)|(N - |Z(t)| - Y_2(t))/N$
because exchanges between a type 1 individual and an individual of type 2
or higher are prohibited in model $N_2$.

Finally, there are transitions in which one of the $\min\{X_1(t), Y_1(t)\}$
individuals that are type 1 in both processes experiences a birth or death, but
the other individual involved in the exchange is one of the $Y_2(t)$ individuals
that have type 2 in model $N_2$, so the type 1 population does not change in
model $N_2$. Such changes occur at rate $Y_2(t)\min\{X_1(t), Y_1(t)\}/N$.

Thus, if we let

$$\lambda(t) = \frac{|Z(t)|(N - |Z(t)| - Y_2(t)1_{\{Y_1(t) > X_1(t)\}})}{N} + \frac{Y_2(t)\min\{X_1(t), Y_1(t)\}}{N},$$

then at time t the Z process is increasing by 1 at rate $\lambda(t)$ and decreasing by
1 at rate $\lambda(t) + \mu(t)$. The next result uses these facts to control the difference
between $X_1(t)$ and $Y_1(t)$.
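
The jump rates of the Z process can be written as a short function. This is only a sketch of the formula for $\lambda(t)$ together with the upper bound $u_2X_1(t)$ on $\mu(t)$; the function name and the flag `mutation_allowed` (standing for "no individual of type 2 or higher is present in model $M_2$") are assumptions made for illustration.

```python
def z_rates(x1, y1, y2, N, u2, mutation_allowed=True):
    """Return (rate of +1 jumps, rate of -1 jumps) of Z = X_1 - Y_1.

    The +1 rate is lambda(t); the -1 rate is lambda(t) + mu(t), where mu(t)
    is taken equal to its upper bound u2 * x1 when mutations are allowed
    and 0 when they are suppressed.
    """
    z = abs(x1 - y1)
    indicator = 1 if y1 > x1 else 0
    lam = z * (N - z - y2 * indicator) / N + y2 * min(x1, y1) / N
    mu = u2 * x1 if mutation_allowed else 0.0
    return lam, lam + mu

up_equal, down_equal = z_rates(x1=5, y1=5, y2=0, N=100, u2=1e-3)
up_gap, down_gap = z_rates(x1=4, y1=6, y2=2, N=100, u2=0.0)
print(up_equal, down_equal, up_gap, down_gap)
```

When $X_1 = Y_1$ and $Y_2 = 0$ the only possible jump is the downward one caused by a mutation in model $M_2$, which is the source of the drift term $B_N$ in the proof of Lemma 4.6.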

Lemma 4.6. Fix t > 0. Let $Z_N(s) = r_{1,m}Z(sr_{1,m}^{-1})$ for all $s \ge 0$. If $X_1(0) =
Y_1(0) = [\varepsilon r_{1,m}^{-1}]$ and $X_2(0) = Y_2(0) = 0$, then

$$\max_{0 \le s \le t} |Z_N(s)| \to_p 0.$$

Proof. We will use Theorem 4.1 from Chapter 7 in [10] to show that
$Z_N$ converges to a diffusion with b(x) = 0, a(x) = 2|x|, and initial point 0,
so the limit is identically zero. The first step is to observe that the
Yamada-Watanabe theorem (see, for example, (3.3) on page 193 of [6]) gives pathwise
uniqueness for the limiting SDE, which in turn implies that the martingale
problem is well posed. To verify the other assumptions of the theorem, define

$$B_N(t) = -\int_0^t \mu(sr_{1,m}^{-1})\,ds$$

and

$$A_N(t) = r_{1,m}\int_0^t \left(2\lambda(sr_{1,m}^{-1}) + \mu(sr_{1,m}^{-1})\right) ds.$$

In view of the transition rates for the process $(Z(t), t \ge 0)$, we see that
at time s the process $Z_N(s)$ experiences positive jumps by the amount
$r_{1,m}$ at rate $\lambda(sr_{1,m}^{-1})r_{1,m}^{-1}$ and negative jumps by the same amount at rate
$(\lambda(sr_{1,m}^{-1}) + \mu(sr_{1,m}^{-1}))r_{1,m}^{-1}$. Therefore, letting $M_N(t) = Z_N(t) - B_N(t)$, the
processes $(M_N(t), t \ge 0)$ and $(M_N^2(t) - A_N(t), t \ge 0)$ are martingales. To
obtain the result of the lemma from Theorem 4.1 in Chapter 7 of [10], it
remains to show that for any fixed T > 0, we have

(4.5)    $$\sup_{0 \le t \le T} |B_N(t)| \to_p 0$$

and

(4.6)    $$\sup_{0 \le t \le T} \left|A_N(t) - \int_0^t 2|Z_N(s)|\,ds\right| \to_p 0.$$

To prove (4.5), note that

$$\sup_{0 \le t \le T} |B_N(t)| \le T\sup_{0 \le t \le T} \mu(tr_{1,m}^{-1}) \le Tu_2\max_{0 \le s \le Tr_{1,m}^{-1}} X_1(s).$$

Since $r_{1,m}/(Tu_2) \to \infty$ by (4.4), (4.5) now follows from Lemma 4.5 with
$f(N) = 1/(Tu_2)$. For (4.6), note that

$$A_N(t) - \int_0^t 2|Z_N(s)|\,ds = r_{1,m}\int_0^t \left(-\frac{2|Z(sr_{1,m}^{-1})|^2}{N} - \frac{2|Z(sr_{1,m}^{-1})|\,Y_2(sr_{1,m}^{-1})\,1_{\{Y_1(sr_{1,m}^{-1}) > X_1(sr_{1,m}^{-1})\}}}{N} + \frac{2Y_2(sr_{1,m}^{-1})\min\{X_1(sr_{1,m}^{-1}), Y_1(sr_{1,m}^{-1})\}}{N} + \mu(sr_{1,m}^{-1})\right) ds.$$

It suffices to control the absolute values of the four terms over all $t \le T$.
Note that $|Z(sr_{1,m}^{-1})| \le \max\{X_1(sr_{1,m}^{-1}), Y_1(sr_{1,m}^{-1})\}$. Therefore, by Lemma 4.5
with $f(N) = \sqrt{N/r_{1,m}}$, the three quantities $\max_{0 \le s \le Tr_{1,m}^{-1}} r_{1,m}^{1/2}N^{-1/2}Y_1(s)$,
$\max_{0 \le s \le Tr_{1,m}^{-1}} r_{1,m}^{1/2}N^{-1/2}Y_2(s)$ and $\max_{0 \le s \le Tr_{1,m}^{-1}} r_{1,m}^{1/2}N^{-1/2}X_1(s)$ all converge
in probability to zero as $N \to \infty$. This is enough to establish the convergence
of the first three terms. The result for the fourth term follows from (4.5) and
the fact that $r_{1,m} \to 0$. □

In the model $N_2$, types $j \ge 2$ have the same relationship to type 1
individuals as in the branching process. That is, type 1's give birth to type 2's,
but the fate of a type 2 family does not affect the number of type 1
individuals because a type 1 individual cannot be exchanged with an individual
of type 2 or higher. Lemma 4.3 has shown that we can ignore type 2 births
that occur when another type 2 is present, so successive type 2 births give
independent chances of producing a type m individual. We are now close
to our goal announced in the Introduction of reducing the m-type problem
to the 2-type problem with $\bar{u}_2 = u_2q_{2,m}$, that is, to the simplified model in
which at each type 2 mutation, we flip a coin with probability $q_{2,m}$ of heads
to see if it will generate a type m individual.

Let model $N_2'$ be model $N_2$ modified so that if a type 2 mutation occurs
when $Y_2(t) > 0$, instead of suppressing this event entirely, we flip a coin with
probability $q_{2,m}$ of heads. We then add a type m individual to the population
if the coin is heads and otherwise make no change. Lemma 4.3 implies that
the difference between the probability of getting a type m individual in
model $N_2$ and the probability of getting a type m individual in model $N_2'$
tends to zero as $N \to \infty$. However, it is easier to prove the next result using
model $N_2'$ because in model $N_2'$, each type 1 individual is giving rise to
individuals that will produce a type m descendant at rate $u_2q_{2,m}$, regardless
of whether there are other individuals in the population of type 2 or higher.

Lemma 4.7. Let $\varepsilon > 0$. Consider model $N_2'$ starting from $[\varepsilon r_{1,m}^{-1}]$ type 1
individuals at time zero. Let $h'_{N,m,\varepsilon}$ be the probability that a type m individual
is born at some time. Then

$$\lim_{N\to\infty} h'_{N,m,\varepsilon} = 1 - e^{-\varepsilon}.$$

Proof. Consider a modified branching process in which type j individuals
give birth at rate one, die at rate one, and give birth to type j + 1
individuals at rate $u_{j+1}$. Let $h^*_{N,m,\varepsilon}$ be the probability that if the branching
process starts with $[\varepsilon r_{1,m}^{-1}]$ individuals, a type m individual is born at some
time. Since different families are independent, Lemma 4.1 implies

$$h^*_{N,m,\varepsilon} = 1 - (1 - p_m)^{[\varepsilon r_{1,m}^{-1}]} \to 1 - e^{-\varepsilon},$$

where $p_m$ is the probability that a type 1 individual has a type m descendant.
We now compare this process to model $N_2'$. The number of type 1 individuals
in model $N_2'$ jumps more slowly than the number of type 1 individuals
in the branching process, but in both processes type 1 individuals give birth
to type 2 individuals at rate $u_2$, and then type 2 individuals and their
descendants evolve independently of the type 1's. Therefore, if the probability
$p_{2,m}$ that a type 2 individual in the branching process produces a type m
descendant were equal to $q_{2,m}$, then it would follow that $h'_{N,m,\varepsilon} \ge h^*_{N,m,\varepsilon}$.
Instead, we only have $p_{2,m} \sim q_{2,m}$ because $p_{2,m} \sim r_{2,m}$ by the remark after
Lemma 4.1 and $q_{2,m} \sim r_{2,m}$ by the induction hypothesis. It follows that

$$\liminf_{N\to\infty} h'_{N,m,\varepsilon} \ge 1 - e^{-\varepsilon}.$$

To get a bound in the opposite direction, observe that we can pick $K \to \infty$
so that $L = Kr_{1,m}^{-1} = o(N)$, and with probability tending to one as $N \to \infty$,
the number of type 1's does not reach L. Therefore, writing $h'_{N,m,\varepsilon}$ and
$h^*_{N,m,\varepsilon}$ as functions of the rate at which type 1 individuals give birth to type
2 individuals, we have

$$h'_{N,m,\varepsilon}(u_2) \le h^*_{N,m,\varepsilon}(u_2N/(N-L))(1 + o(1)) + o(1) \to 1 - e^{-\varepsilon},$$

which completes the proof. □
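
The limit for the branching process is just the elementary fact that $1 - (1-p)^n \to 1 - e^{-\varepsilon}$ when $np \to \varepsilon$. The sketch below checks this numerically; taking the success probability exactly equal to $r_{1,m}$ and the particular values of $\varepsilon$ and $r_{1,m}$ are simplifying assumptions for illustration only.

```python
import math

def hit_probability(p, eps, r):
    """1 - (1 - p)^n with n = floor(eps / r) independent families,
    computed with log1p/expm1 for numerical stability."""
    n = math.floor(eps / r)
    return -math.expm1(n * math.log1p(-p))

eps, r = 0.5, 1e-6
approx = hit_probability(p=r, eps=eps, r=r)   # take p_m = r_{1,m} exactly
limit = 1.0 - math.exp(-eps)
print(approx, limit)
```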

Lemma 4.8. Let $\varepsilon > 0$. Consider model $M_2$ starting from $[\varepsilon r_{1,m}^{-1}]$ type 1
individuals at time zero. Let $h_{N,m,\varepsilon}$ be the probability that a type m individual
is born at some time. Then

$$\lim_{N\to\infty} |h_{N,m,\varepsilon} - h'_{N,m,\varepsilon}| = 0.$$

Proof. Recall the coupling between model $M_2$ and model $N_2$ described
earlier in this section. With this coupling, if a type 2 mutation occurs at
the same time in both processes, then it produces a type m descendant
in one process if and only if it produces a type m descendant in the other.
Consequently, it suffices to bound the probability that some type 2 mutation
that appears in one process but not the other produces a type m descendant.

Lemma 4.3 bounds this probability for mutations that occur in one model
but get suppressed in the other because there are no individuals of type 2 or
higher. It remains to consider the mutations experienced by the $|Z(t)|$
individuals that are type 1 in one process but not the other. Pick s large enough
so that the probability that $N_2'$ or $M_2$ does not die out by time $sr_{1,m}^{-1}$ is $< \delta$. Pick
$\eta$ so that $\eta s < \delta^2$. By Lemma 4.6, if N is large, we have $\max_{t \le s} |Z_N(t)| \le \eta$
with probability $> 1 - \delta$. The expected number of births that occur in one
process but not in the other before time $sr_{1,m}^{-1}$ when $\max_{t \le s} |Z_N(t)| \le \eta$ is
bounded by

$$\eta r_{1,m}^{-1} \cdot sr_{1,m}^{-1}u_2 \le \delta^2r_{1,m}^{-2}u_2.$$

Using Chebyshev's inequality, it follows that with probability $> 1 - 4\delta$ the
number of type 2 mutant births that occur in one process but not the other
is bounded by $\delta r_{1,m}^{-2}u_2 = \delta r_{2,m}^{-1}$. When this occurs, the success probabilities
differ by at most $\delta$ because each mutation has probability $q_{2,m} \sim r_{2,m}$ of
producing a type m descendant. Since $\delta > 0$ is arbitrary, the desired results
follow. □

Proof of Proposition 4.1. The probability that the number of individuals
of type greater than zero reaches $[\varepsilon r_{1,m}^{-1}]$ is $1/[\varepsilon r_{1,m}^{-1}]$. If, at the time
T when the number of individuals of type greater than zero reaches $[\varepsilon r_{1,m}^{-1}]$,
we change the type of all individuals whose type is nonzero to type 1, and if
we disregard type 2 mutations that occur when there is another individual
of type $j \ge 2$, then the probability of getting a type m individual after this
time becomes $h_{N,m,\varepsilon}$. Since these changes of the types can only reduce the
probability of getting a type m individual, we have

(4.7)    $$q_m \ge \frac{1}{[\varepsilon r_{1,m}^{-1}]}\,h_{N,m,\varepsilon}.$$

Also, for a type m individual to appear, either the type m individual must
be descended from a type 1 individual that is alive at time T, or else the
type m individual must be descended from a type 2 individual that existed
before time T, so using Lemmas 4.3 and 4.4, it follows that

(4.8)    $$q_m \le \frac{1}{[\varepsilon r_{1,m}^{-1}]}\,h_{N,m,\varepsilon} + C\varepsilon r_{1,m}.$$

The result follows by letting $\varepsilon \to 0$. □
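
The final limit can be spelled out as follows. This is a worked version added for clarity, assuming that Lemmas 4.7 and 4.8 together give $h_{N,m,\varepsilon} \to 1 - e^{-\varepsilon}$ and using $[\varepsilon r_{1,m}^{-1}] \sim \varepsilon r_{1,m}^{-1}$; dividing (4.7) and (4.8) by $r_{1,m}$ and letting $N \to \infty$ yields

```latex
\frac{1 - e^{-\varepsilon}}{\varepsilon}
  \le \liminf_{N\to\infty} \frac{q_m}{r_{1,m}}
  \le \limsup_{N\to\infty} \frac{q_m}{r_{1,m}}
  \le \frac{1 - e^{-\varepsilon}}{\varepsilon} + C\varepsilon,
\qquad
\lim_{\varepsilon \to 0} \frac{1 - e^{-\varepsilon}}{\varepsilon} = 1 .
```

Since C does not depend on $\varepsilon$, both outer bounds tend to 1 as $\varepsilon \to 0$, which gives $q_m \sim r_{1,m}$.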

5. Proof of Theorem 2. In this section, we complete the proof of The- 
orem 2. The argument is based on the following result on Poisson approxi- 
mation, which is part of Theorem 1 of [2]. 

Lemma 5.1. Suppose $(A_i)_{i \in I}$ is a collection of events, where I is any
index set. Let $W = \sum_{i \in I} 1_{A_i}$ be the number of events that occur, and let
$\lambda = E[W] = \sum_{i \in I} P(A_i)$. Suppose for each $i \in I$, we have $i \in \beta_i \subset I$. Let
$\mathcal{F}_i = \sigma((A_j)_{j \in I \setminus \beta_i})$. Define

$$b_1 = \sum_{i \in I}\sum_{j \in \beta_i} P(A_i)P(A_j),$$

$$b_2 = \sum_{i \in I}\sum_{i \ne j \in \beta_i} P(A_i \cap A_j),$$

$$b_3 = \sum_{i \in I} E[|P(A_i|\mathcal{F}_i) - P(A_i)|].$$

Then $|P(W = 0) - e^{-\lambda}| \le b_1 + b_2 + b_3$.

We will use the following lemma to get the second moment estimate
needed to bound $b_2$. When we apply this result, the individuals born at
times $t_1$ and $t_2$ will both be type 1. We call the second one type 2 to be able
to easily distinguish the descendants of the two individuals.
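
To see what Lemma 5.1 says in the simplest situation, consider independent events with $\beta_i = \{i\}$, for which $b_2 = b_3 = 0$ and $b_1 = \sum_i P(A_i)^2$. The sketch below evaluates both sides of the bound in that special case; the function name and the choice of M equally likely events with probability $t/M$ are illustrative assumptions, chosen to resemble the subdivision into M intervals used later in this section.

```python
import math

def poisson_bound_independent(probs):
    """For independent events with beta_i = {i}, Lemma 5.1 has b_2 = b_3 = 0
    and b_1 = sum of P(A_i)^2. Returns (P(W = 0), exp(-lambda), b_1)."""
    p_none = math.prod(1.0 - p for p in probs)
    lam = sum(probs)
    b1 = sum(p * p for p in probs)
    return p_none, math.exp(-lam), b1

M, t = 1000, 1.0
p_none, poisson, b1 = poisson_bound_independent([t / M] * M)
print(abs(p_none - poisson), b1)
```

Here $b_1 = t^2/M$, so the Poisson approximation error vanishes as the subdivision is refined.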

Lemma 5.2. Fix times $t_1 < t_2$. Consider a population of size N which
evolves according to the Moran model in which all individuals initially have
type 0. There are no mutations, except that one individual becomes type 1 at
time $t_1$, and one type 0 individual (if there is one) becomes type 2 at time $t_2$.
Fix a positive integer $L \le N/2$. For i = 1, 2, let $Y_i(t)$ be the number of type
i individuals at time t and let $B_i$ be the event that $L \le \max_{t \ge 0} Y_i(t) \le N/2$.
Then

$$P(B_1 \cap B_2) \le 2/L^2.$$

Proof. Because $(Y_1(t), t \ge t_1)$ is a martingale, it is clear that $P(B_1) \le
1/L$. Let $s_1 < s_2 < \cdots < s_J$ be the ordered times, after time $t_2$, at which the
$Y_1$ process jumps. For $t \ge t_2$, let $Z(t) = Y_2(t)A(t)$, where

$$A(t) = \frac{N - Y_1(t_2)}{N - Y_1(t)} = \prod_{i : s_i \le t} \frac{N - Y_1(s_i-)}{N - Y_1(s_i)}.$$

We claim that conditional on $(Y_1(t), t \ge t_1)$, the process $(Z(t), t \ge t_2)$ is a
martingale.

To see this, note that between the times $s_i$, births and deaths of type
2 individuals occur at the same rate, even conditional on $(Y_1(t), t \ge t_1)$, so
Z(t) experiences both positive and negative jumps of size $(N - Y_1(t_2))/(N -
Y_1(t))$ at the same rate. At the time $s_i$, if $Y_1(s_i) = Y_1(s_i-) + 1$, then one of
the $N - Y_1(s_i-)$ individuals of type other than 1 dies at time $s_i$, so we
have $Y_2(s_i) = Y_2(s_i-) - 1$ with probability $a_i = Y_2(s_i-)/(N - Y_1(s_i-))$ and
$Y_2(s_i) = Y_2(s_i-)$ with probability $1 - a_i$. Note that

$$(1 - a_i)Y_2(s_i-) + a_i(Y_2(s_i-) - 1) = Y_2(s_i-) - a_i = Y_2(s_i-)\left(1 - \frac{1}{N - Y_1(s_i-)}\right) = Y_2(s_i-)\,\frac{N - Y_1(s_i)}{N - Y_1(s_i-)}.$$

Likewise, if $Y_1(s_i) = Y_1(s_i-) - 1$, then one of the $N - Y_1(s_i-)$ individuals of
type other than 1 gives birth at time $s_i$, so $Y_2(s_i) = Y_2(s_i-) + 1$ with
probability $a_i = Y_2(s_i-)/(N - Y_1(s_i-))$ and $Y_2(s_i) = Y_2(s_i-)$ with probability
$1 - a_i$, and we have

$$(1 - a_i)Y_2(s_i-) + a_i(Y_2(s_i-) + 1) = Y_2(s_i-) + a_i = Y_2(s_i-)\left(1 + \frac{1}{N - Y_1(s_i-)}\right) = Y_2(s_i-)\,\frac{N - Y_1(s_i)}{N - Y_1(s_i-)}.$$

The martingale property follows because $A(s_i) = A(s_i-)(N - Y_1(s_i-))/(N -
Y_1(s_i))$, compensating for the expected change in the $Y_2$ process.

Since $(Z(t), t \ge t_2)$ is a martingale conditional on $(Y_1(t), t \ge t_1)$ and $Z(t_2) =
1$, we have $P(Z(t) \ge L/2$ for some $t\,|\,B_1) \le 2/L$. On the event $B_1$, we have
$1/2 \le A(t) \le 2$ for all $t \ge t_2$, so

$$P(B_2|B_1) \le P(Y_2(t) \ge L \text{ for some } t\,|\,B_1) \le P(Z(t) \ge L/2 \text{ for some } t\,|\,B_1) \le 2/L.$$

Since $P(B_1) \le 1/L$, the result follows. □

We now introduce a set-up that will allow us to apply Lemma 5.1. Let
$\varepsilon > 0$, and let K be a large positive number that will be chosen later. Let
$\bar{q}_m$ be the probability that in model $M_1$:

• there is eventually a type m individual in the population,

• the maximum number of individuals of nonzero type over all times is
between $\varepsilon/r_{1,m}$ and N/2, and

• the family lives for time $\le K/r_{1,m}$; that is, there are no individuals of
nonzero type remaining at time $K/r_{1,m}$.

We will call the second and third points the side conditions. Divide the
interval $[0, t/(Nr_{0,m})]$ into M subintervals of equal length, where $Mr_{1,m} \to
\infty$ as $N \to \infty$. Label the intervals $I_1, \ldots, I_M$, and let $D_i$ be the event that
there is a type 1 mutation in the interval $I_i$.

For bookkeeping purposes, we will also introduce type 1b mutations,
which individuals of type greater than zero experience at rate $u_1$ but which
do not affect the type of the individual. Mutations to type zero individuals
will be called type 1a mutations, and the phrase "type 1 mutation" will
refer both to type 1a and type 1b mutations for the rest of this section.
This will ensure that type 1 mutations are always occurring at rate exactly
$Nu_1$. To determine whether or not the first type 1b mutation in interval i
is "successful," we let $\xi_1, \ldots, \xi_M$ be i.i.d. random variables, independent of
our process, that equal 1 with probability $\bar{q}_m$.

Let $A_i$ be the event that there is a type 1 mutation in the interval $I_i$ and
one of the following occurs:

• The first type 1 mutation in $I_i$ has type 1a. The individual that gets this
mutation has a type m descendant and the side conditions hold. That
is, the maximum number of descendants of the mutation over all times
is between $\varepsilon/r_{1,m}$ and N/2, and there are no descendants of the mutation
remaining in the population at the time $K/r_{1,m}$ after the mutation
occurred.

• The first type 1 mutation in $I_i$ has type 1b, and $\xi_i = 1$.

As in Lemma 5.1, let $W = \sum_{i=1}^M 1_{A_i}$ be the number of events that occur,
and let $\lambda = E[W]$.

Lemma 5.3. $\limsup_{N\to\infty} |P(W = 0) - e^{-\lambda}| = 0$.

Proof. Let $\beta_i$ consist of all subintervals whose distance to $I_i$ is at most
$K/r_{1,m}$. Define $b_1$, $b_2$ and $b_3$ as in Lemma 5.1. We first claim that $b_3 = 0$.
Suppose $I_i = [a, b]$. Note that the event $A_i$ does not depend on the state of
the population at time a. Also, because of the side condition that a mutation
is not considered successful if it has descendants surviving for a time longer
than $K/r_{1,m}$, the event $A_i$ is determined by time $b + K/r_{1,m}$ and is therefore
independent of the events $A_j$ for j > i and $j \notin \beta_i$. Likewise, all of the events
$A_j$ for j < i and $j \notin \beta_i$ are determined by the behavior of the process before
time a, so these events are also independent of $A_i$. It follows that $A_i$ is
independent of $(A_j)_{j \notin \beta_i}$, and thus that $b_3 = 0$.

The length of the interval $I_i$ is $t/(MNr_{0,m})$, so since type 1 mutations
occur at rate $Nu_1$, we have $P(D_i) \le Nu_1|I_i| = t/(Mr_{1,m})$. Since $P(A_i|D_i) =
\bar{q}_m$, Proposition 4.1 gives

$$P(A_i) = \bar{q}_mP(D_i) \le tq_m/(Mr_{1,m}) \sim t/M.$$

There are at most $2(K/(r_{1,m}|I_i|) + 1)$ intervals in $\beta_i$, so for large M

$$b_1 \le M \cdot 2\left(\frac{K}{r_{1,m}|I_i|} + 1\right)\left(\frac{2t}{M}\right)^2 = 8KNu_1t + \frac{8t^2}{M},$$

where the last step uses $Nu_1|I_i| = t/(Mr_{1,m})$. Since $Nu_1 \to 0$ by (i) and $M \to \infty$, $b_1 \to 0$.

To bound $b_2$, note that $P(D_i \cap D_j) \le [t/(Mr_{1,m})]^2$ because mutations
in disjoint intervals occur independently. We now apply Lemma 5.2 with
$L = \varepsilon/r_{1,m}$, $t_1$ being the time of the first mutation in $I_i$, and $t_2$ being the
time of the first mutation in $I_j$. For the event $A_i$ to occur, it is necessary
for the event $B_1$ considered in Lemma 5.2 to occur. Note that mutations do
not affect the result of Lemma 5.2 because the side conditions involve all
descendants of the original mutation, regardless of type. Therefore,

$$P(A_i \cap A_j|D_i \cap D_j) \le 2r_{1,m}^2/\varepsilon^2$$

and thus $P(A_i \cap A_j) \le 2t^2/(M\varepsilon)^2$. Since there are at most $2(K/(r_{1,m}|I_i|) + 1)$
intervals in $\beta_i$, we have

$$b_2 \le M \cdot 2\left(\frac{K}{r_{1,m}|I_i|} + 1\right)\frac{2t^2}{(M\varepsilon)^2} = 4M\left(\frac{KMNr_{0,m}}{r_{1,m}t}\right)\left(\frac{t}{M\varepsilon}\right)^2 + \frac{4t^2}{M\varepsilon^2}.$$

This shows $b_2 \to 0$, and completes the proof. □

Lemma 5.4. Let $\sigma_m$ be the first time at which there is a type 1 individual
in the population that will have a type m descendant. Then

(5.1)    $$\lim_{N\to\infty} P(\sigma_m > t/(Nr_{0,m})) = \exp(-t).$$

Proof. To obtain (5.1) from Lemma 5.3, it suffices to show that there
is a constant C such that for sufficiently large N, we have $|t - \lambda| \le C\varepsilon$
and $|P(W = 0) - P(\sigma_m > t/(Nr_{0,m}))| \le C\varepsilon$. The result will then follow by
letting $\varepsilon \to 0$. Clearly, $\bar{q}_m \le q_m$, and $q_m - \bar{q}_m$ is at most the probability that
in model $M_1$, a type m individual appears even though either (a) the total
number of individuals of nonzero type never exceeds $\varepsilon/r_{1,m}$, (b) the total
number of individuals of nonzero type exceeds N/2, or (c) the family does
not die out before $K/r_{1,m}$. Because $Nr_{1,m} \to \infty$, we have $K/r_{1,m} \le N$ for
sufficiently large N, so we can apply Lemma 3.1 to show that the probability
that a given mutation survives for as long as $K/r_{1,m}$ is at most $Cr_{1,m}/K$.
Using Lemma 4.4, we get

$$q_m - \bar{q}_m \le C\varepsilon r_{1,m} + 2/N + Cr_{1,m}/K.$$

Since $Nr_{1,m} \to \infty$ by (iv), we have $2/N \ll r_{1,m}$, so if K is large, we get

(5.2)    $$q_m - C\varepsilon r_{1,m} \le \bar{q}_m \le q_m.$$

Note that

$$\lambda = \sum_{i=1}^M P(A_i) = \sum_{i=1}^M P(D_i)\bar{q}_m = MP(D_1)\bar{q}_m = M\bar{q}_m(1 - e^{-Nu_1|I_1|}) \sim M\bar{q}_mNu_1|I_1| = t\bar{q}_m/r_{1,m}.$$

Because $q_m \sim r_{1,m}$ by Proposition 4.1, this result combined with (5.2) implies
$|t - \lambda| \le C\varepsilon$ for sufficiently large N.

It remains to bound $|P(W = 0) - P(\sigma_m > t/(Nr_{0,m}))|$. We can have W > 0
with $\sigma_m > t/(Nr_{0,m})$ only if for some i, there is a type 1b mutation in $I_i$
and $\xi_i = 1$. Let X(t) be the number of individuals of nonzero type. As long
as X(t) stays below $\varepsilon N$, type 1b mutations occur at rate at most $N\varepsilon u_1$, so
the probability that this occurs is at most

$$(\varepsilon Nu_1)(t/(Nr_{0,m}))\bar{q}_m \le C\varepsilon,$$

using Proposition 4.1. Since individuals give birth and die at the same rate,
$(X(t), t \ge 0)$ is a submartingale. Also, $E[X(t/(Nr_{0,m}))]$ is the expected number
of type 1a mutations before time $t/(Nr_{0,m})$, which is at most $t/r_{1,m}$.
Therefore, by Doob's maximal inequality,

$$P(X(s) \ge \varepsilon N \text{ for some } s \le t/(Nr_{0,m})) \le t/(\varepsilon Nr_{1,m}),$$

which goes to zero as $N \to \infty$ by condition (iv).

We can have W = 0 with $\sigma_m \le t/(Nr_{0,m})$ in one of two ways. One possibility
is that there could be a successful type 1 mutation in one of the M
subintervals that is not the first type 1 mutation in that interval. The expected
number of type 1 mutations in the ith interval that are not the first
in their interval is at most $(t/(Mr_{1,m}))^2$. Therefore, the probability that some
successful type 1 mutation is not the first type 1 mutation in its interval
is at most $M(t/(Mr_{1,m}))^2q_m \le C/(Mr_{1,m})$. Since $Mr_{1,m} \to \infty$, this probability
tends to zero as $N \to \infty$. The other possibility is that there could be
a successful type 1 mutation that does not satisfy the extra conditions we
imposed. The probability that this occurs is at most

$$(Nu_1)(t/(Nr_{0,m}))(q_m - \bar{q}_m) \le Ct\varepsilon$$

by (5.2). This observation completes the proof of the lemma. □

The following result in combination with Lemma 5.4 implies Theorem 2. 

Lemma 5.5. We have

(5.3)    $$Nr_{0,m}(\tau_m - \sigma_m) \to 0 \quad\text{in probability.}$$

Proof. Let $\varepsilon > 0$ and $\delta > 0$. By Lemma 5.4, we can choose s large
enough that for sufficiently large N,

$$P(\sigma_m > s/(Nr_{0,m})) < \delta/3.$$

By Lemma 3.1, the probability that a type 1a mutation takes longer than
time $\varepsilon/(Nr_{0,m})$ to die out or fixate is at most $C\max\{1/N, Nr_{0,m}/\varepsilon\}$. Because
the expected number of type 1a mutations before time $s/(Nr_{0,m})$ is
at most $(Nu_1)(s/(Nr_{0,m})) = u_1s/r_{0,m}$, it follows from Markov's inequality
that the probability that some type 1a mutation that appears before time
$s/(Nr_{0,m})$ takes longer than time $\varepsilon/(Nr_{0,m})$ to die out or fixate is at most
$Cs\max\{u_1/(Nr_{0,m}), Nu_1/\varepsilon\}$. As $N \to \infty$, the first of these terms goes to
zero by (iv) while the second goes to zero by (i), so this probability is less
than $\delta/3$ for sufficiently large N. Finally, the probability that one of the
type 1a mutations before time $s/(Nr_{0,m})$ fixates is at most

$$Nu_1 \cdot \frac{s}{Nr_{0,m}} \cdot \frac{1}{N},$$

since mutations occur at rate $Nu_1$ and fix with probability 1/N. This is
less than $\delta/3$ for large N by (iv). Hence, $P(Nr_{0,m}(\tau_m - \sigma_m) > \varepsilon) < \delta$ for
sufficiently large N. □

6. The key to the proof of Theorem 3. Throughout this section and the 
next, we assume all of the hypotheses of Theorem 3 are satisfied. The main 
difficulty in proving Theorem 3 is to prove the following result. 

Proposition 6.1. Let $\varepsilon > 0$. Consider a process which evolves according
to the rules of model $M_1$ but starting with $[\varepsilon N]$ type 1 individuals and all
other individuals having type 0. Let $g_{N,m}(\varepsilon)$ be the probability that either a
type m individual is born at some time or eventually all N individuals have
type greater than zero. Then

$$\lim_{\varepsilon\to 0}\liminf_{N\to\infty} \varepsilon^{-1}g_{N,m}(\varepsilon) = \lim_{\varepsilon\to 0}\limsup_{N\to\infty} \varepsilon^{-1}g_{N,m}(\varepsilon) = \alpha,$$

where $\alpha$ is as defined in (1.4).

The first lemma will allow us to ignore overlap between type 2 families. 

Lemma 6.1. With probability tending to one as $N \to \infty$, no type 2
individual that is born while there is an individual of type 0 in the population
and another individual in the population of type 2 or higher will have a type
m descendant.



Proof. The argument is similar to the proof of Lemma 4.3. By (3.5),
when we start with $[N\varepsilon]$ type 1 individuals, the total number of births and
deaths of individuals of nonzero type, before the number of individuals of
nonzero type reaches 0 or N, is at most $2\varepsilon N^2$. Since individuals give birth
and die at rate 1 and mutate at rate $u_2$, the expected number of type 2
mutations is at most $\varepsilon N^2u_2$. By (3.8), the expected amount of time during
which there is an individual of type 2 or higher present in the population
is at most $C\varepsilon(N^2\log N)u_2$. Type 2 mutations happen at rate at most $Nu_2$
and produce a type m descendant with probability $q_{2,m}$, so the probability
that a type 2 individual born while there is another individual in the
population of type 2 or higher produces a type m descendant is at most
$C\varepsilon N^3(\log N)u_2^2q_{2,m}$, which is at most

(6.1)    $$C(Nr_{1,m})^2(N\log N)u_2,$$

because $u_2r_{2,m} = r_{1,m}^2$ and $q_{2,m} \sim r_{2,m}$ by Corollary 4.1. Also, we are
assuming $Nr_{1,m} \to \gamma > 0$ and (ii) gives $r_{1,m} \ge Cu_2^{1-2^{-(m-1)}}$ for some constant
C. Therefore, $\limsup_{N\to\infty} Nu_2^{1-2^{-(m-1)}} < \infty$, which in combination with (iii)
implies that

(6.2)    $$(N\log N)u_2 \to 0.$$

It follows that the expression in (6.1) tends to zero as $N \to \infty$. □

In view of Lemma 6.1, it suffices to prove Proposition 6.1 for model $M_2$, in which no type 2 mutation can occur while there is another individual of type 2 or higher in the population. We will work with model $M_2$ for the rest of this section. As in the proof of Theorem 1, we need to deal with the correlations between individuals of type 1 and of types $j \ge 2$ caused by the fact that individuals of one positive type may replace another. To do this, we cut out the time intervals in which an individual of type 2 or higher is present in the population.

Let $X_i(t)$ be the number of type $i$ individuals at time $t$. Let

$$f(t) = \sup\left\{ s : \int_0^s 1_{\{X_0(u) + X_1(u) = N\}} \, du = t \right\}$$

and let $Y(t) = X_1(f(t))$, so the process $(Y(t), t \ge 0)$ tracks the evolution of the number of type 1 individuals after one cuts out the times at which individuals of type $j \ge 2$ are present. Let $\beta_0 = 0$. For $i \ge 1$, let $\beta_i$ be the first time $t$ after $\beta_{i-1}$ such that $Y(t) \ne Y(t-)$ and there is no type two individual alive at time $f(t)-$, assuming such a time exists, which it will a.s. as long as $Y(\beta_{i-1}) \notin \{0, N\}$. That is, the times $\beta_i$ are the times of $Y$ process jumps that happen because of a birth or death of a type one individual and do not involve the birth of a type two individual. Let $g(t) = \max\{i : \beta_i \le t\}$, so $g(t)$ is the number of these jumps that have happened by time $t$.

We now define a discrete-time process $(Z_i)_{i=0}^{\infty}$, which omits the jumps in $Y$ due to time intervals being removed, but retains all of the other jumps of size 1. Let $Z_0 = [N\varepsilon]$. If $i \ge 1$, $Y(\beta_{i-1}) \notin \{0, N\}$, and $\varepsilon^3 N < Z_{i-1} < (1 - \varepsilon^2)N$, then let $Z_i = Z_{i-1} + 1$ if $Y(\beta_i) = Y(\beta_i-) + 1$, and let $Z_i = Z_{i-1} - 1$ if $Y(\beta_i) = Y(\beta_i-) - 1$. Using this induction, we can define the process $(Z_i)_{i=0}^{T}$, where $T = \min\{i : Y(\beta_i) \in \{0, N\},\ Z_i \le \varepsilon^3 N, \text{ or } Z_i \ge (1 - \varepsilon^2)N\}$. On the event that $\varepsilon^3 N < Z_{i-1} < (1 - \varepsilon^2)N$ and $0 < Y(\beta_{i-1}) < N$, we have $P(Z_i = Z_{i-1} + 1 \mid Z_0, \ldots, Z_{i-1}) = P(Z_i = Z_{i-1} - 1 \mid Z_0, \ldots, Z_{i-1}) = 1/2$. We then continue the process for $i > T$ by setting $Z_i$ to be $Z_{i-1} + 1$ or $Z_{i-1} - 1$ with probability 1/2 each, independently of the population process. The process $(Z_i)_{i=0}^{\infty}$ is therefore a simple random walk.

Note that $T$ is smaller than the absorption time of the process $(Z_i)_{i=0}^{\infty}$ in $\{0, N\}$, which can be compared to the absorption time of model $M_0$ started with $[N\varepsilon]$ type 1 individuals. It therefore follows from (3.9) that $E[\beta_T] \le N$. Thus, if $\theta > 0$, then by Markov's inequality,

$$P\left( \beta_T > \frac{N}{\theta} \right) \le \theta. \tag{6.3}$$

Likewise, since $T$ is at most the number of births and deaths of individuals of nonzero type started from $[N\varepsilon]$ such individuals, (3.5) gives $E[T] \le 2N^2\varepsilon \le 2N^2$. Therefore, for $\theta > 0$,

$$P\left( T > \frac{2N^2}{\theta} \right) \le \theta. \tag{6.4}$$



Lemma 6.2. For all $\delta > 0$, we have

$$\lim_{N \to \infty} P\left( \max_{0 \le t \le \beta_T} |Y(t) - Z_{g(t)}| > \delta N \right) = 0.$$

Proof. Let $\zeta_0 = 0$ and for $i \ge 1$, let $\zeta_i$ be the first time $t$ after $\zeta_{i-1}$ such that there is a type 2 individual alive at time $f(t)-$, provided such a time exists. Thus, the times $\zeta_i$ for $i \ge 1$ are the times at which the process $(Y(t), t \ge 0)$ possibly jumps because we have cut out the lifetime of a type 2 family. Every jump time of $(Y(t), t \ge 0)$ is either $\beta_i$ or $\zeta_i$ for some $i$. Since only the jumps at the times $\beta_i$ are incorporated into the process $(Z_i)_{i=0}^{\infty}$, we have

$$Y(t) - Z_{g(t)} = \sum_{i : \zeta_i \le t} (Y(\zeta_i) - Y(\zeta_i-)). \tag{6.5}$$

We will show that the right-hand side is small because type 2 individuals are not alive in the population for a long enough time for large changes in the size of the type 1 population to happen during this time.

A type 1 individual is lost whenever a type 2 individual is born. The other changes in the number of type 1 individuals that contribute to the right-hand side of (6.5) are births and deaths that occur while there are already type 2 individuals in the population. Let $\eta_i = 1$ if the $i$th such event is a birth, and let $\eta_i = -1$ if the $i$th such event is a death. Let $J$ be the number of such events before time $f(\beta_T)$, so if $S_j = \eta_1 + \cdots + \eta_j$, then

$$|Y(t) - Z_{g(t)}| \le |\{i : \zeta_i \le \beta_T\}| + \max_{j \le J} |S_j| \tag{6.6}$$

for all $t \le \beta_T$.

The first term on the right-hand side of (6.6) is the number of type 2 mutations by time $\beta_T$, so as noted above its expected value is at most $\varepsilon N^2 u_2$. It follows from Markov's inequality and (6.2) that $P(|\{i : \zeta_i \le \beta_T\}| > \delta N/2) \le 4\varepsilon N^2 u_2/(\delta N) \to 0$ as $N \to \infty$.






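Since $(S_j)_{j=1}^{\infty}$ is a simple random walk, by the monotone convergence theorem, the $L^2$-maximal inequality for martingales, and Wald's second equation, we have

$$E\left[ \max_{j \le J} S_j^2 \right] = \lim_{n \to \infty} E\left[ \max_{j \le J \wedge n} S_j^2 \right] \le 4 \lim_{n \to \infty} E[S_{J \wedge n}^2] = 4 \lim_{n \to \infty} E[J \wedge n] = 4E[J].$$

We have observed that the expected amount of time for which there is an individual of type 2 or greater present in the population is at most $C\varepsilon (N^2 \log N) u_2$. The rate at which type one individuals are either being born or dying is always at most $2N$, so $E[J] \le 2C\varepsilon (N^3 \log N) u_2$. By Chebyshev's inequality and (6.2),

$$\limsup_{N \to \infty} P\left( \max_{j \le J} |S_j| > \frac{\delta N}{2} \right) \le \limsup_{N \to \infty} \frac{16 E[J]}{\delta^2 N^2} \le \limsup_{N \to \infty} \frac{32 C \varepsilon (N \log N) u_2}{\delta^2} = 0,$$

and the result follows. □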
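The maximal inequality used in the proof above can be checked by brute force on a small case. The sketch below is only an illustration under a simplifying assumption: it replaces the random number of steps $J$ by a fixed horizon `n` (so Wald's equation reduces to $E[S_n^2] = n$) and enumerates all $2^n$ paths of a simple random walk. The function name `expected_max_square` is hypothetical.

```python
from itertools import product

def expected_max_square(n):
    """Exact E[max_{1<=j<=n} S_j^2] for a simple random walk of n steps,
    computed by enumerating all 2^n equally likely +-1 paths."""
    total = 0
    for steps in product((-1, 1), repeat=n):
        s = 0
        best = 0
        for step in steps:
            s += step
            best = max(best, s * s)   # running maximum of S_j^2 along the path
        total += best
    return total / 2 ** n

for n in (1, 2, 4, 8, 12):
    lhs = expected_max_square(n)
    # E[S_n^2] = n, so the L^2-maximal inequality reads lhs <= 4 * n.
    print(n, lhs, 4 * n)
```

Since $\max_j S_j^2 \ge S_n^2$, the computed value always lies between $n$ and $4n$; the proof only needs the upper bound.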



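Lemma 6.3. For all $\delta > 0$, we have

$$\lim_{N \to \infty} P\left( \left| \int_0^{\beta_T} (Y(t) - Z_{g(t)}) \, dt \right| > \delta N^2 \right) = 0.$$

Proof. Let $\theta > 0$. By Lemma 6.2 and (6.3),

$$\limsup_{N \to \infty} P\left( \left| \int_0^{\beta_T} (Y(t) - Z_{g(t)}) \, dt \right| > \delta N^2 \right) \le \limsup_{N \to \infty} \left[ P\left( \beta_T > \frac{N}{\theta} \right) + P\left( \max_{0 \le t \le \beta_T} |Y(t) - Z_{g(t)}| > \delta\theta N \right) \right] \le \theta.$$

Letting $\theta \to 0$ gives the result. □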

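Lemma 6.4. For all $\delta > 0$, we have

$$\lim_{N \to \infty} P\left( \left| \int_0^{\beta_T} Z_{g(t)} \, dt - \sum_{i=0}^{T-1} \frac{N}{2(N - Z_i)} \right| > \delta N^2 \right) = 0.$$

Proof. For $i \le T - 1$, let

$$D_i = (\beta_{i+1} - \beta_i) Z_i - \frac{N}{2(N - Z_i)}.$$

We need to show that

$$\lim_{N \to \infty} P\left( \left| \sum_{i=0}^{T-1} D_i \right| > \delta N^2 \right) = 0. \tag{6.7}$$

At time $t$, events that cause the number of type 1 individuals to change but do not involve the birth of a type 2 happen at rate $2Y(t)(N - Y(t))/N$. Therefore, if we define

$$\xi_i = \int_{\beta_i}^{\beta_{i+1}} \frac{2Y(t)(N - Y(t))}{N} \, dt,$$

then the random variables $\xi_i$ are independent and have the exponential distribution with mean one. Note that the process $Y$ is constant on the intervals $(\beta_i, \beta_{i+1})$ except when type 2 mutations occur. For $i \le T - 1$, let

$$\tilde{D}_i = \frac{N}{2(N - Z_i)} (\xi_i - 1).$$

Let $\theta > 0$, so $P(T > 2N^2/\theta) \le \theta$ by (6.4). For $0 \le j \le [2N^2/\theta]$, let $M_j = \sum_{i=0}^{(T-1) \wedge j} \tilde{D}_i$. Let $\mathcal{F}_i$ be the $\sigma$-field generated by $(Y(t), 0 \le t \le \beta_i)$. Note that $E[\tilde{D}_i \mid \mathcal{F}_i] = 0$, so the process $(M_j)$ is a martingale. On the event that $i \le T - 1$, we have $Z_i < (1 - \varepsilon^2)N$, and hence

$$\mathrm{Var}(\tilde{D}_i \mid \mathcal{F}_i) = \frac{N^2}{4(N - Z_i)^2} \le \frac{1}{4\varepsilon^4}.$$

It follows from the $L^2$-maximal inequality for martingales and orthogonality of martingale increments that

$$E\left[ \max_{0 \le j \le [2N^2/\theta]} M_j^2 \right] \le 4E\left[ M_{[2N^2/\theta]}^2 \right] \le 4 \cdot \frac{2N^2}{\theta} \cdot \frac{1}{4\varepsilon^4} = \frac{2N^2}{\theta\varepsilon^4}.$$

Using Chebyshev's inequality,

$$P\left( \left| \sum_{i=0}^{T-1} \tilde{D}_i \right| > \delta N^2 \right) \le P\left( T > \frac{2N^2}{\theta} \right) + P\left( \max_{0 \le j \le [2N^2/\theta]} |M_j| > \delta N^2 \right) \le \theta + \frac{2}{\theta \delta^2 \varepsilon^4 N^2}.$$

Since $\theta > 0$ was arbitrary, it follows that

$$\lim_{N \to \infty} P\left( \left| \sum_{i=0}^{T-1} \tilde{D}_i \right| > \delta N^2 \right) = 0. \tag{6.8}$$

To convert this into a bound on the $D_i$, we note that

$$|D_i - \tilde{D}_i| = \left| \frac{N}{2(N - Z_i)} \int_{\beta_i}^{\beta_{i+1}} \frac{2Y(t)(N - Y(t))}{N} \, dt - (\beta_{i+1} - \beta_i) Z_i \right| \le \int_{\beta_i}^{\beta_{i+1}} \left| \frac{Y(t)(N - Y(t))}{N - Z_i} - Z_i \right| dt.$$

On the event that $|Y(t) - Z_{g(t)}| \le \eta N$ for all $0 \le t \le \beta_T$, there is a constant $C_\varepsilon$ depending on $\varepsilon$ such that for all $i \le T - 1$ and $t \in [\beta_i, \beta_{i+1}]$, we have

$$\frac{Y(t)(N - Y(t))}{N - Z_i} - Z_i \le \frac{(Z_i + \eta N)(N - Z_i + \eta N)}{N - Z_i} - Z_i \le (Z_i + \eta N)\left[ 1 + \frac{\eta}{\varepsilon^2} \right] - Z_i \le C_\varepsilon \eta N,$$

where in the second inequality we have used $Z_i < (1 - \varepsilon^2)N$. For a bound in the other direction, we note that

$$\frac{Y(t)(N - Y(t))}{N - Z_i} - Z_i \ge \frac{(Z_i - \eta N)(N - Z_i - \eta N)}{N - Z_i} - Z_i \ge -C_\varepsilon \eta N.$$

Thus, if we let $\theta > 0$ and $\eta = \delta\theta/(2C_\varepsilon)$, then for sufficiently large $N$,

$$P\left( \left| \sum_{i=0}^{T-1} (D_i - \tilde{D}_i) \right| > \delta N^2 \right) \le P\left( \beta_T > \frac{N}{\theta} \right) + P\left( \max_{0 \le t \le \beta_T} |Y(t) - Z_{g(t)}| > \eta N \right).$$

Using (6.3), Lemma 6.2, and letting $\theta \to 0$, we get

$$\lim_{N \to \infty} P\left( \left| \sum_{i=0}^{T-1} (D_i - \tilde{D}_i) \right| > \delta N^2 \right) = 0. \tag{6.9}$$

Now (6.7) follows from (6.8) and (6.9). □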

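Let $D$ be the event that either $Z_T \ge (1 - \varepsilon^2)N$ or some type 2 mutation that occurs before time $f(\beta_T)$ has a type $m$ descendant.

Lemma 6.5. We have

$$\lim_{N \to \infty} \left| (1 - P(D)) - E\left[ \exp\left( -r_{2,m} \sum_{i=0}^{T-1} \frac{u_2 N}{2(N - Z_i)} \right) 1_{\{Z_T \le \varepsilon^3 N\}} \right] \right| = 0.$$

Proof. If there is no type 2 individual in the population at time $t$, then the rate at which a type 2 individual is born is $u_2 X_1(t)$. Because no type 2 mutations occur while there is another type 2 individual in the population, each mutant type 2 individual independently has a type $m$ descendant with probability $q_{2,m}$. It follows that there is a mean one exponential random variable $\xi$ such that some original type two individual born before time $f(\beta_T)$ has a type $m$ descendant if and only if

$$\xi \le \int_0^{\beta_T} Y(t) u_2 q_{2,m} \, dt. \tag{6.10}$$

Because changes in the population resulting from the birth of a type 2 individual are not recorded in the process $(Z_i)_{i=0}^{\infty}$, the random variable $\xi$ can be constructed to be independent of the process $(Z_i)_{i=0}^{\infty}$. Therefore, by conditioning on $(Z_i)_{i=0}^{T}$, we get

$$P\left( \{Z_T \le \varepsilon^3 N\} \cap \left\{ \xi > r_{2,m} \sum_{i=0}^{T-1} \frac{u_2 N}{2(N - Z_i)} \right\} \right) = E\left[ \exp\left( -r_{2,m} \sum_{i=0}^{T-1} \frac{u_2 N}{2(N - Z_i)} \right) 1_{\{Z_T \le \varepsilon^3 N\}} \right]. \tag{6.11}$$

The event that $D$ fails to occur is the same as the event that $Z_T \le \varepsilon^3 N$ and that (6.10) fails to occur. It follows that the difference between $P(D^c) = 1 - P(D)$ and the probability in (6.11) is at most the probability that $\xi$ is between $\int_0^{\beta_T} Y(t) u_2 q_{2,m} \, dt$ and $r_{2,m} \sum_{i=0}^{T-1} u_2 N/(2(N - Z_i))$. To bound the difference between these quantities, note that Lemmas 6.3 and 6.4 give

$$\lim_{N \to \infty} P\left( \left| \int_0^{\beta_T} u_2 Y(t) \, dt - \sum_{i=0}^{T-1} \frac{u_2 N}{2(N - Z_i)} \right| > \delta N^2 u_2 \right) = 0$$

for all $\delta > 0$. Since $r_{1,m}^2 = u_2 r_{2,m}$ and $(N r_{1,m})^2 \to \gamma$, we see that $N^2 u_2 r_{2,m}$ stays bounded as $N \to \infty$, and it follows that

$$\lim_{N \to \infty} P\left( r_{2,m} \left| \int_0^{\beta_T} u_2 Y(t) \, dt - \sum_{i=0}^{T-1} \frac{u_2 N}{2(N - Z_i)} \right| > \delta \right) = 0 \tag{6.12}$$

for all $\delta > 0$. Also, $q_{2,m} \sim r_{2,m}$ by Corollary 4.1 and $P(\beta_T > N/\theta) \le \theta$ by (6.3). Since $N^2 u_2 r_{2,m}$ stays bounded,

$$\limsup_{N \to \infty} P\left( \left| \int_0^{\beta_T} u_2 q_{2,m} Y(t) \, dt - u_2 r_{2,m} \int_0^{\beta_T} Y(t) \, dt \right| > \delta \right) \le \limsup_{N \to \infty} P\left( N u_2 \beta_T |r_{2,m} - q_{2,m}| > \delta \right) = 0. \tag{6.13}$$

Combining (6.12) and (6.13) gives

$$\lim_{N \to \infty} P\left( \left| \int_0^{\beta_T} u_2 q_{2,m} Y(t) \, dt - r_{2,m} \sum_{i=0}^{T-1} \frac{u_2 N}{2(N - Z_i)} \right| > \delta \right) = 0.$$

Since

$$P\left( r_{2,m} \sum_{i=0}^{T-1} \frac{u_2 N}{2(N - Z_i)} - \delta \le \xi \le r_{2,m} \sum_{i=0}^{T-1} \frac{u_2 N}{2(N - Z_i)} + \delta \right) \le 2\delta,$$

it follows that

$$\limsup_{N \to \infty} \left| (1 - P(D)) - E\left[ \exp\left( -r_{2,m} \sum_{i=0}^{T-1} \frac{u_2 N}{2(N - Z_i)} \right) 1_{\{Z_T \le \varepsilon^3 N\}} \right] \right| \le 2\delta,$$

and the result follows by letting $\delta \to 0$. □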

Let $A$ be the event that either $Y(t) = N$ for some $t$, or a type $m$ individual is born at some time.

Lemma 6.6. There exists a constant $C$, not depending on $\varepsilon$ or $N$, such that

$$|P(A) - P(D)| \le C\varepsilon^2.$$

Proof. Let $\delta > 0$, and assume that $|Y(t) - Z_{g(t)}| \le \delta N$ for $0 \le t \le \beta_T$. First, suppose $D$ occurs. If a type 2 mutation that occurs before time $f(\beta_T)$ has a type $m$ descendant, then $A$ must occur. If $Z_T \ge (1 - \varepsilon^2)N$, then $Y(\beta_T) \ge (1 - \varepsilon^2 - \delta)N$, and conditional on this event the probability that $Y(t) = N$ for some $t$, in which case $A$ occurs, is at least $1 - \varepsilon^2 - \delta$. Therefore, using Lemma 6.2,

$$\limsup_{N \to \infty} P(D \cap A^c) \le \varepsilon^2 + \delta.$$

Now, suppose $D^c$ occurs. Note that if $\delta < \varepsilon^3$ and $|Y(t) - Z_{g(t)}| \le \delta N$ for $0 \le t \le \beta_T$, then we cannot have $Y(\beta_T) \in \{0, N\}$, which means we must have $Z_T \le \varepsilon^3 N$ and, therefore, $Y(\beta_T) \le (\varepsilon^3 + \delta)N$. Conditional on this event, the probability that $Y(t) = N$ for some $t$ is at most $\varepsilon^3 + \delta$, and the probability that one of the type one individuals at time $f(\beta_T)$ has a type $m$ descendant is at most $(\varepsilon^3 + \delta) N q_{1,m}$. From these bounds and Lemma 6.2, it follows that

$$\limsup_{N \to \infty} P(D^c \cap A) \le (1 + \gamma^{1/2})(\varepsilon^3 + \delta).$$

The lemma follows by letting $\delta \to 0$. □



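Now let $(B_t)_{t \ge 0}$ be a Brownian motion with $B_0 = \varepsilon$. Let $U = \inf\{t : B_t = \varepsilon^3 \text{ or } B_t = 1 - \varepsilon^2\}$.

Lemma 6.7. We have

$$\lim_{N \to \infty} E\left[ \exp\left( -r_{2,m} \sum_{i=0}^{T-1} \frac{u_2 N}{2(N - Z_i)} \right) 1_{\{Z_T \le \varepsilon^3 N\}} \right] = E\left[ \exp\left( -\frac{\gamma}{2} \int_0^U \frac{1}{1 - B_t} \, dt \right) 1_{\{B_U = \varepsilon^3\}} \right].$$

Proof. Define a process $(W_t)_{t \ge 0}$ such that $W_t = N^{-1} Z_{[N^2 t]}$. Let $R = \inf\{t : W_t \le \varepsilon^3 \text{ or } W_t \ge 1 - \varepsilon^2\}$. Note that $R = T/N^2$ and $1_{\{Z_T \le \varepsilon^3 N\}} = 1_{\{W_R \le \varepsilon^3\}}$ on the event that for some $\delta < \varepsilon^3$, we have $|Y(t) - Z_{g(t)}| \le \delta N$ for $0 \le t \le \beta_T$, which by Lemma 6.2 happens with probability tending to one as $N \to \infty$.

Let $\delta < \varepsilon^3$. For random variables $X_N^{(1)}$ and $X_N^{(2)}$, write $X_N^{(1)} \approx X_N^{(2)}$ if for all $\eta > 0$, there is an $N(\eta)$ such that if $N \ge N(\eta)$ then $|X_N^{(1)}/X_N^{(2)} - 1| < \eta$ on the event that $|Y(t) - Z_{g(t)}| \le \delta N$ for $0 \le t \le \beta_T$. On this event, we have

$$\frac{1}{2} \int_0^R \frac{1}{1 - W_t} \, dt = \frac{1}{2} \int_0^R \frac{1}{1 - N^{-1} Z_{[N^2 t]}} \, dt = \frac{1}{2} \int_0^{N^2 R} \frac{1}{1 - N^{-1} Z_{[s]}} N^{-2} \, ds = \frac{1}{N^2} \sum_{i=0}^{T-1} \frac{N}{2(N - Z_i)}.$$

Since $u_2 r_{2,m} = r_{1,m}^2$ and $(N r_{1,m})^2 \to \gamma$, we have

$$r_{2,m} \sum_{i=0}^{T-1} \frac{u_2 N}{2(N - Z_i)} \approx \frac{\gamma}{N^2} \sum_{i=0}^{T-1} \frac{N}{2(N - Z_i)} = \frac{\gamma}{2} \int_0^R \frac{1}{1 - W_t} \, dt.$$

In view of Lemma 6.2, it follows that

$$\lim_{N \to \infty} \left| E\left[ \exp\left( -r_{2,m} \sum_{i=0}^{T-1} \frac{u_2 N}{2(N - Z_i)} \right) 1_{\{Z_T \le \varepsilon^3 N\}} \right] - E\left[ \exp\left( -\frac{\gamma}{2} \int_0^R \frac{1}{1 - W_t} \, dt \right) 1_{\{W_R \le \varepsilon^3\}} \right] \right| = 0.$$

Thus, it suffices to show that for all $\lambda > 0$, we have

$$\lim_{N \to \infty} E\left[ \exp\left( -\lambda \int_0^R \frac{1}{1 - W_t} \, dt \right) 1_{\{W_R \le \varepsilon^3\}} \right] = E\left[ \exp\left( -\lambda \int_0^U \frac{1}{1 - B_t} \, dt \right) 1_{\{B_U = \varepsilon^3\}} \right]. \tag{6.14}$$

Since $(Z_i)_{i=0}^{\infty}$ is a simple random walk, $(W_t)_{0 \le t \le s}$ converges weakly as $N \to \infty$ to $(B_t)_{0 \le t \le s}$ for all $s > 0$. Let $D[0, s]$ be the set of real-valued functions defined on $[0, s]$ which are right continuous and have left limits. If $g : D[0, s] \to \mathbb{R}$ is bounded, and if the set of points at which it is not continuous has Wiener measure zero, then the weak convergence of $(W_t)_{0 \le t \le s}$ to $(B_t)_{0 \le t \le s}$ implies that $\lim_{N \to \infty} E[g((W_t)_{0 \le t \le s})] = E[g((B_t)_{0 \le t \le s})]$. Therefore,

$$\lim_{N \to \infty} E\left[ \exp\left( -\lambda \int_0^{R \wedge s} \frac{1}{1 - W_t} \, dt \right) 1_{\{W_{R \wedge s} \le \varepsilon^3\}} \right] = E\left[ \exp\left( -\lambda \int_0^{U \wedge s} \frac{1}{1 - B_t} \, dt \right) 1_{\{B_{U \wedge s} = \varepsilon^3\}} \right]. \tag{6.15}$$

Note that if $\omega : [0, s] \to \mathbb{R}$ is continuous, then the function $g$ used in (6.15) is continuous at $\omega$ unless either $\inf\{t : \omega(t) = \varepsilon^3\} < \inf\{t : \omega(t) < \varepsilon^3\}$ or $\inf\{t : \omega(t) = 1 - \varepsilon^2\} < \inf\{t : \omega(t) > 1 - \varepsilon^2\}$, which would happen if $\omega$ reaches a local minimum when it first hits $\varepsilon^3$ or a local maximum when it first hits $1 - \varepsilon^2$. Brownian motion paths almost surely do not have this property, so (6.15) is valid. Finally, (6.14) follows from (6.15) by letting $s \to \infty$. □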

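Let $V = \inf\{t : B_t = 0 \text{ or } B_t = 1\}$.

Lemma 6.8. Let $I(s) = \int_0^s \frac{1}{1 - B_t} \, dt$. If $\lambda > 0$, there is a constant $C$ such that

$$\left| E\left[ \exp(-\lambda I(U)) 1_{\{B_U = \varepsilon^3\}} \right] - E\left[ \exp(-\lambda I(V)) 1_{\{B_V = 0\}} \right] \right| \le C\varepsilon^2. \tag{6.16}$$

Proof. Define a process $(B'_t)_{t \ge 0}$ by $B'_t = B_{U+t}$. Let $\tau'_a = \inf\{t : B'_t = a\}$. Let $D_1$ be the event that $B_U = 1 - \varepsilon^2$ and $B_V = 0$. Let $D_2$ be the event that $B_U = \varepsilon^3$ and $\tau'_{1/2} < \tau'_0$. Let $D_3$ be the event that $B_U = \varepsilon^3$ and $\tau'_0 > \varepsilon^2$. Note that on the event $(D_1 \cup D_2 \cup D_3)^c$, we have $1_{\{B_U = \varepsilon^3\}} = 1_{\{B_V = 0\}}$, and on this event we have

$$0 \le \int_0^V \frac{1}{1 - B_t} \, dt - \int_0^U \frac{1}{1 - B_t} \, dt \le 2(V - U) \le 2\varepsilon^2.$$

It follows that the left-hand side of (6.16) is at most $P(D_1) + P(D_2) + P(D_3) + 2\lambda\varepsilon^2$.

Because Brownian motion is a martingale, we have $P(D_1) \le P(B_V = 0 \mid B_U = 1 - \varepsilon^2) = \varepsilon^2$ and likewise $P(D_2) \le 2\varepsilon^3$. Therefore, it remains only to show that $P(D_3) \le C\varepsilon^2$. By the reflection principle,

$$\tfrac{1}{2} P(\tau'_0 \le \varepsilon^2 \mid B_U = \varepsilon^3) = P(B'_{\varepsilon^2} < 0 \mid B_U = \varepsilon^3).$$

Also, $P(B'_{\varepsilon^2} > \varepsilon^3 \mid B_U = \varepsilon^3) = 1/2$. Therefore, $P(0 < B'_{\varepsilon^2} < \varepsilon^3 \mid B_U = \varepsilon^3) = [1 - P(\tau'_0 \le \varepsilon^2 \mid B_U = \varepsilon^3)]/2$. It follows that

$$P(D_3) \le P(\tau'_0 > \varepsilon^2 \mid B_U = \varepsilon^3) = 2P(0 < B'_{\varepsilon^2} < \varepsilon^3 \mid B_U = \varepsilon^3) \le \frac{2\varepsilon^3}{\sqrt{2\pi\varepsilon^2}} = \varepsilon^2 \sqrt{\frac{2}{\pi}},$$

and the result follows. □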
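The reflection-principle estimate for $P(D_3)$ can be checked numerically. The sketch below uses the standard closed form for the probability that a Brownian motion started at $x > 0$ has not hit 0 by time $t$, namely $\mathrm{erf}(x/\sqrt{2t})$, and compares it with the bound $\varepsilon^2\sqrt{2/\pi}$ at $x = \varepsilon^3$, $t = \varepsilon^2$. The function names are hypothetical, and the values of $\varepsilon$ are arbitrary test points.

```python
import math

def prob_not_hit_zero(x, t):
    """P(tau_0 > t) for Brownian motion started at x > 0.

    By the reflection principle this equals 2 P(0 < B_t < x) = erf(x / sqrt(2 t)).
    """
    return math.erf(x / math.sqrt(2.0 * t))

def d3_bound_check(eps):
    """Return (exact probability, upper bound) for start eps^3 and horizon eps^2."""
    exact = prob_not_hit_zero(eps ** 3, eps ** 2)
    bound = eps ** 2 * math.sqrt(2.0 / math.pi)   # normal density bounded by its peak
    return exact, bound

for eps in (0.5, 0.2, 0.1, 0.05):
    exact, bound = d3_bound_check(eps)
    print(eps, exact, bound)
```

The bound comes from replacing the normal density on $(0, \varepsilon^3)$ by its maximum value, so the two columns agree to leading order as $\varepsilon \to 0$.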

Lemma 6.9. Let $E_x$ denote expectation for the Brownian motion $(B_t)_{t \ge 0}$ starting from $B_0 = x$. Let

$$u(x) = E_x\left[ \exp\left( -\frac{\gamma}{2} \int_0^V \frac{1}{1 - B_s} \, ds \right) 1_{\{B_V = 0\}} \right].$$

Then $\lim_{x \to 0} x^{-1}(1 - u(x)) = \alpha$, where $\alpha$ is as defined in (1.4).

Proof. We choose $f$ so that $f(0) = 1$ and $f(1) = 0$. Let $g(x) = \gamma/[2(1 - x)]$. Then for $0 < x < 1$, we have $u(x) = E_x[f(B_V) \exp(-\int_0^V g(B_s) \, ds)]$. Clearly $u(0) = 1$ and $u(1) = 0$. By the Feynman-Kac formula (see (6.3) on page 161 of [6]), if $v : [0, 1] \to \mathbb{R}$ is a bounded continuous function such that $v(0) = 1$, $v(1) = 0$, and $\frac{1}{2} v''(x) - g(x)v(x) = 0$ for $x \in (0, 1)$, then $u(x) = v(x)$ for $x \in [0, 1]$. Note that (6.3) on page 161 of [6] requires $g$ to be bounded on $(0, 1)$, which it is not in this example. However, the result nevertheless holds because $g$ is nonnegative and, therefore, $\exp(-\int_0^t g(B_s) \, ds)$ is always in $[0, 1]$.

Multiplying by $2(1 - x)$, we can write the differential equation above as $(1 - x)v''(x) - \gamma v(x) = 0$. Let

$$v(x) = c \sum_{k=1}^{\infty} \frac{\gamma^k}{k!(k-1)!} (1 - x)^k, \tag{6.17}$$

where $c = 1/\sum_{k=1}^{\infty} \gamma^k/(k!(k-1)!)$. Note that $v(0) = 1$ and $v(1) = 0$. The series converges absolutely and uniformly on all compact subsets of $\mathbb{R}$ and can be differentiated twice term by term, so

$$(1 - x)v''(x) = c \sum_{k=2}^{\infty} \frac{\gamma^k}{k!(k-1)!} k(k - 1)(1 - x)^{k-1}.$$

Therefore,

$$(1 - x)v''(x) - \gamma v(x) = c \sum_{k=1}^{\infty} \left( \frac{\gamma^{k+1}}{(k+1)!\,k!} (k + 1)k - \frac{\gamma^{k+1}}{k!(k-1)!} \right) (1 - x)^k = 0.$$

Thus, $v(x) = u(x)$ for $x \in [0, 1]$. From our formula, it follows that

$$\lim_{x \to 0} \frac{1 - u(x)}{x} = -v'(0) = c \sum_{k=1}^{\infty} \frac{\gamma^k}{(k-1)!(k-1)!} = \alpha,$$

as claimed. □
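The series solution in (6.17) lends itself to a direct numerical check. The following sketch, which assumes the series is truncated at a hypothetical cutoff `kmax` (40 terms is far more than needed for moderate $\gamma$), evaluates $v$, the residual of $(1 - x)v''(x) - \gamma v(x) = 0$, and the slope $-v'(0)$, and compares the latter with the ratio of series that this proof identifies with $\alpha$. All function names are illustrative.

```python
import math

def coeffs(gamma, kmax=40):
    """Series coefficients gamma^k / (k! (k-1)!) for k = 1..kmax."""
    return [gamma ** k / (math.factorial(k) * math.factorial(k - 1))
            for k in range(1, kmax + 1)]

def v(x, gamma, kmax=40):
    """Truncated series (6.17), normalised so that v(0) = 1."""
    a = coeffs(gamma, kmax)
    c = 1.0 / sum(a)
    return c * sum(ak * (1.0 - x) ** k for k, ak in enumerate(a, start=1))

def v_second(x, gamma, kmax=40):
    """Second derivative of the truncated series, differentiated term by term."""
    a = coeffs(gamma, kmax)
    c = 1.0 / sum(a)
    return c * sum(ak * k * (k - 1) * (1.0 - x) ** (k - 2)
                   for k, ak in enumerate(a, start=1) if k >= 2)

def minus_v_prime_at_zero(gamma, kmax=40):
    """-v'(0) = c * sum_k k * gamma^k / (k! (k-1)!)."""
    a = coeffs(gamma, kmax)
    return sum(ak * k for k, ak in enumerate(a, start=1)) / sum(a)

def alpha(gamma, kmax=40):
    """Ratio sum gamma^k/((k-1)!)^2 over sum gamma^k/(k!(k-1)!), as in the proof."""
    num = sum(gamma ** k / math.factorial(k - 1) ** 2 for k in range(1, kmax + 1))
    return num / sum(coeffs(gamma, kmax))

gamma = 2.0
for x in (0.0, 0.25, 0.5, 0.75):
    residual = (1.0 - x) * v_second(x, gamma) - gamma * v(x, gamma)
    print(x, v(x, gamma), residual)
print(minus_v_prime_at_zero(gamma), alpha(gamma))
```

The residuals should vanish up to rounding, and the two numbers on the last line should agree, which mirrors the term-by-term cancellation in the display above.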



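Proof of Proposition 6.1. The only difference between $g_{N,m}(\varepsilon)$ and $P(A)$ is that the event $A$ is defined using model $M_2$, in which new type two individuals cannot be born while there is an existing individual of type 2 or higher in the population. Therefore, it follows from Lemma 4.3 that $|P(A) - g_{N,m}(\varepsilon)| \ll [N\varepsilon] r_{1,m}$ and, therefore,

$$\lim_{N \to \infty} |P(A) - g_{N,m}(\varepsilon)| = 0$$

for all $\varepsilon > 0$. By Lemmas 6.5, 6.6, 6.7 and 6.8,

$$\limsup_{N \to \infty} |P(A) - (1 - u(\varepsilon))| \le C\varepsilon^2.$$

Combining these results and multiplying both sides by $\varepsilon^{-1}$ gives

$$\limsup_{N \to \infty} |\varepsilon^{-1} g_{N,m}(\varepsilon) - \varepsilon^{-1}(1 - u(\varepsilon))| \le C\varepsilon.$$

Therefore, by Lemma 6.9,

$$\lim_{\varepsilon \to 0} \liminf_{N \to \infty} \varepsilon^{-1} g_{N,m}(\varepsilon) \ge \lim_{\varepsilon \to 0} (\varepsilon^{-1}(1 - u(\varepsilon)) - C\varepsilon) = \alpha,$$

$$\lim_{\varepsilon \to 0} \limsup_{N \to \infty} \varepsilon^{-1} g_{N,m}(\varepsilon) \le \lim_{\varepsilon \to 0} (\varepsilon^{-1}(1 - u(\varepsilon)) + C\varepsilon) = \alpha,$$

and the proposition follows. □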

7. Proof of Theorem 3. With Proposition 6.1 established, the rest of the 
proof is routine. 

Lemma 7.1. Consider model $M_1$, and let $q'_m$ be the probability that either a type $m$ individual is born at some time, or at some time all individuals in the population have type greater than zero. Then $\lim_{N \to \infty} N q'_m = \alpha$.

Proof. The probability that the number of individuals of type greater than zero reaches $[\varepsilon N]$ is $1/[\varepsilon N]$. If, at the time $T$ when the number of individuals of nonzero type reaches $[\varepsilon N]$, we change the type of all these individuals to type 1, then the probability of either getting a type $m$ individual or eventually having all $N$ individuals of type greater than zero is $g_{N,m}(\varepsilon)$. Since changing the types in this way can only reduce the probability of interest, we have

$$q'_m \ge \frac{g_{N,m}(\varepsilon)}{[\varepsilon N]}.$$

To get an upper bound, note that the probability of either having a type $m$ individual that is descended from a type 1 individual at time $T$ or having all $N$ individuals of nonzero type is at most $g_{N,m}(\varepsilon)/[\varepsilon N]$. The only possibility not accounted for is that the type $m$ individual could be descended from a type 2 individual that is born before time $T$. However, by Lemma 4.4, the proof of which is valid under our hypotheses by Corollary 4.1, the probability that a type 2 mutation that occurs while there are fewer than $\varepsilon r_{1,m}^{-1}$ individuals in the population of type 1 or higher has a type $m$ descendant is at most $C\varepsilon/N$, where we are using that $r_{1,m}^{-1}$ is $O(N)$. It follows that

$$q'_m \le \frac{g_{N,m}(\varepsilon)}{[\varepsilon N]} + \frac{C\varepsilon}{N}.$$

The result follows from Proposition 6.1 by first letting $N \to \infty$ and then letting $\varepsilon \to 0$. □
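The first step of the proof above, that a single mutant lineage reaches size $k = [\varepsilon N]$ with probability $1/k$, is the classical gambler's ruin identity for a symmetric walk. As a minimal sketch (assuming, as in the construction of $(Z_i)$ earlier, that each change in the count is $+1$ or $-1$ with probability 1/2), the harmonic equations can be solved exactly with rational arithmetic. The function name `hit_probabilities` is hypothetical.

```python
from fractions import Fraction

def hit_probabilities(k):
    """Return h[0..k], where h[i] = P(symmetric +-1 walk from i hits k before 0).

    h solves h[0] = 0, h[k] = 1 and h[i] = (h[i-1] + h[i+1]) / 2 for 0 < i < k.
    We shoot from h[0] = 0 with trial slope 1, then rescale so that h[k] = 1.
    """
    g = [Fraction(0), Fraction(1)]
    for i in range(1, k):
        g.append(2 * g[i] - g[i - 1])   # rearranged harmonic equation
    return [value / g[k] for value in g]

k = 50                                  # e.g. [eps * N] with eps = 0.05, N = 1000
h = hit_probabilities(k)
print(h[1])                             # probability starting from one individual
```

Starting from one individual, the returned value is exactly `Fraction(1, k)`, matching the factor $1/[\varepsilon N]$ in both the lower and the upper bound for $q'_m$.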

Proof of Theorem 3. As in the proof of Theorem 2, call ordinary type 1 mutations type 1a, and give each individual of type greater than zero a type 1b mutation at rate $u_1$. Mutations of type 1a and 1b will both be called type 1 mutations. Let $\gamma_i$ be the time of the $i$th type 1 mutation, so the points $(\gamma_i)_{i=1}^{\infty}$ form a rate $N u_1$ Poisson process on $[0, \infty)$. Define a sequence $(\zeta_i)_{i=1}^{\infty}$ such that $\zeta_i = 1$ if the mutation at time $\gamma_i$ is a type 1a mutation and has a type $m$ descendant in the population at some later time (which will always happen if the mutation fixates). Let $(\tilde{\zeta}_i)_{i=1}^{\infty}$ be a sequence of i.i.d. random variables, independent of the population process, such that $P(\tilde{\zeta}_i = 1) = q'_m$ and $P(\tilde{\zeta}_i = 0) = 1 - q'_m$ for all $i$. Let $\zeta'_i = \zeta_i$ if all individuals at time $\gamma_i-$ have type 0, and let $\zeta'_i = \tilde{\zeta}_i$ otherwise. Let $\sigma'_m = \inf\{\gamma_i : \zeta'_i = 1\}$. It is clear from the construction that $\sigma'_m$ has the exponential distribution with rate $N u_1 q'_m$, so Lemma 7.1 gives

$$\lim_{N \to \infty} P(u_1 \sigma'_m > t) = \exp(-\alpha t). \tag{7.1}$$

Let $\sigma_m = \inf\{\gamma_i : \zeta_i = 1\}$, which is the first time at which a type 1a mutation occurs and the individual that gets this mutation will eventually have a type $m$ descendant. We claim that $P(\sigma'_m = \sigma_m) \to 1$ as $N \to \infty$. We can only have $\sigma'_m \ne \sigma_m$ if there is a type 1 mutation at some time $\gamma_i \le \sigma'_m$ such that not all individuals at time $\gamma_i-$ have type 0 and either $\zeta_i = 1$ or $\tilde{\zeta}_i = 1$. Note also that in this case the first such $\gamma_i$ must occur before any type 1 mutation fixates, so it suffices to consider the $\gamma_i$ that occur before any fixation. Fix $t > 0$. The expected number of type 1 mutations before time $u_1^{-1} t$ is $(N u_1)(u_1^{-1} t) = Nt$, so by (3.8), the expected amount of time before $u_1^{-1} t$ and before any type 1 mutation fixates that there is an individual of nonzero type in the population is at most $C(N \log N)t$. Therefore, the expected number of type 1 mutations that occur before this time is at most $C(N^2 \log N) u_1 t$. If such a birth occurs at time $\gamma_i$, the probability that either $\zeta_i$ or $\tilde{\zeta}_i$ equals one is at most $2q'_m$, so

$$P(\sigma_m \ne \sigma'_m \text{ and } \sigma'_m \le u_1^{-1} t) \le C(N^2 \log N) u_1 t q'_m \to 0,$$

where we are using that $u_1 (N \log N) \to 0$ by (ii) and (6.2) and that $q'_m$ is $O(1/N)$ by Lemma 7.1. The fact that $P(\sigma'_m = \sigma_m) \to 1$ as $N \to \infty$ follows from this result and (7.1).

It remains only to show that $u_1(\tau_m - \sigma_m) \to_p 0$. When the type 1 mutation at time $\sigma_m$ does not fixate, $\tau_m - \sigma_m$ is at most the time that it takes before all descendants of the mutation die out. When this mutation fixates, $\tau_m - \sigma_m$ includes both the time to fixation and the time for one individual to get $m - 1$ additional mutations. The probability that a given type 1 mutation takes time $\varepsilon u_1^{-1}$ to fixate or die out is at most $C u_1 \varepsilon^{-1} \log N$, so the probability that some mutation that occurs before time $u_1^{-1} t$ takes this long to fixate or die out is at most $C(N u_1)(u_1^{-1} t)(u_1 \varepsilon^{-1} \log N)$, which approaches zero as $N \to \infty$ because $u_1 (N \log N) \to 0$. Finally, if a type 1 mutation fixates, then the time until a type $m$ mutation appears can be calculated using the $m - 1$ case of Theorem 2 with $u_2, \ldots, u_m$ in place of $u_1, \ldots, u_{m-1}$. The hypotheses are satisfied by the arguments given in Corollary 4.1. Theorem 2 implies that the waiting time is $O(1/(N u_2 r_{2,m}))$. However, $1/(N u_2 r_{2,m}) \ll u_1^{-1}$ because $u_1/u_2 < b_1^{-1}$ by (ii) and $N r_{2,m} \to \infty$ as shown in the proof of Corollary 4.1. These observations imply $u_1(\tau_m - \sigma_m) \to_p 0$, as in the proof of Theorem 2.
□
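The claim in the proof above that $\sigma'_m$ is exponential with rate $N u_1 q'_m$ is an instance of Poisson thinning. The sketch below checks the corresponding survival function numerically under the simplifying assumption that the marks are i.i.d. with success probability `q` and independent of the Poisson times; it conditions on the number of points in $[0, t]$ and sums the series directly, which is only suitable for moderate values of `rate * t`. The function name and the numerical values are hypothetical and are not taken from the paper.

```python
import math

def survival_by_conditioning(rate, q, t, nmax=400):
    """P(no marked point in [0, t]) for a Poisson process of intensity `rate`
    whose points are marked independently with probability q.

    The number n of points in [0, t] is Poisson(rate * t), and all n points
    are unmarked with probability (1 - q)^n.
    """
    mean = rate * t
    term = math.exp(-mean)              # n = 0 term of the Poisson sum
    total = term
    for n in range(1, nmax + 1):
        term *= mean * (1.0 - q) / n    # ratio of consecutive terms
        total += term
    return total

# Illustrative values only: N individuals, mutation rate u1, success probability q.
N, u1, q, t = 50, 0.01, 0.03, 2.0
lhs = survival_by_conditioning(N * u1, q, t / u1)   # P(u1 * sigma' > t)
rhs = math.exp(-N * q * t)                          # exponential with rate N * q
print(lhs, rhs)
```

With $q = q'_m$ and $N q'_m \to \alpha$, the right-hand side tends to $\exp(-\alpha t)$, which is the limit stated in (7.1).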